{"title":"Robust and Efficient Shoe Print Image Retrieval Using Spatial Transformer Network and Deep Hashing","authors":"Wei Liu, Dawei Xu","doi":"10.1145/3532342.3532356","DOIUrl":null,"url":null,"abstract":"In recent years, great progress has been made on the topic of shoe print image retrieval. However, it still remains a big challenge to accurately retrieve well-matched shoe print images from a huge database in real time. Deep hashing method has been proved to be effective in large-scale image retrieval in many applications. Output of deep hash network can be represented as a binary bit hash code, which helps to reduce storage space and retrieval time. Existing shoe print image retrieval methods have poor performance in instance retrieval because of shortage of classification information, sufficient sample information and feature quantization information. In order to overcome the problem, we put forward an end-to-end network to learn short deep hash codes. The learned hash code preserves the classification and small sample information very well. Moreover, due to the fact that STN (Spatial Transformer Network) block is simultaneously embedded into the hash network to enhance the retrieval ability for rotated shoe print images, the problem of rotation misalignment can be solved and the retrieval accuracy is improved. Furthermore, in order to make better use of class label information, we presented a new joint loss function. This loss function helps the network map both images’ classification information and similarities into hash codes and reduce quantitative loss. In addition, we used triple labels to alleviate sample imbalance problem. Experiments on database including 10,500 shoe print images show that our proposed method can improve the retrieval performance. 
The proposed approach can yield a mAP (mean Average Precision) of 0.83 and a recall of 0.35, which demonstrates the discriminatory power of the learned hash codes in shoe print image retrieval application.","PeriodicalId":398859,"journal":{"name":"Proceedings of the 4th International Symposium on Signal Processing Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th International Symposium on Signal Processing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3532342.3532356","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In recent years, great progress has been made in shoe print image retrieval. However, accurately retrieving well-matched shoe print images from a huge database in real time remains a major challenge. Deep hashing has proved effective for large-scale image retrieval in many applications: the output of a deep hash network can be represented as a binary hash code, which reduces both storage space and retrieval time. Existing shoe print image retrieval methods perform poorly on instance retrieval because they fail to exploit classification information, suffer from insufficient sample information, and lose information through feature quantization. To overcome these problems, we propose an end-to-end network that learns short deep hash codes. The learned hash codes preserve classification and small-sample information well. Moreover, a Spatial Transformer Network (STN) block is embedded into the hash network to enhance retrieval of rotated shoe print images, solving the rotation misalignment problem and improving retrieval accuracy. Furthermore, to make better use of class label information, we present a new joint loss function that helps the network map both classification information and inter-image similarities into the hash codes while reducing quantization loss. In addition, we use triplet labels to alleviate the sample imbalance problem. Experiments on a database of 10,500 shoe print images show that the proposed method improves retrieval performance, yielding a mAP (mean Average Precision) of 0.83 and a recall of 0.35, which demonstrates the discriminative power of the learned hash codes for shoe print image retrieval.
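The abstract's core idea of hash-based retrieval can be sketched as follows: quantize a network's real-valued outputs into binary codes, then rank the database by Hamming distance to the query code. This is a minimal illustrative sketch, not the paper's implementation; the code length (48 bits), database size, and random embeddings standing in for network outputs are all assumptions.

```python
import numpy as np

def binarize(embeddings):
    """Quantize real-valued network outputs to {0, 1} hash codes via sign."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database entries by Hamming distance (number of differing bits)."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

# Random embeddings stand in for the hash network's outputs (hypothetical).
rng = np.random.default_rng(0)
db_codes = binarize(rng.standard_normal((1000, 48)))  # 1000 images, 48-bit codes
query = binarize(rng.standard_normal(48))

order, dists = hamming_rank(query, db_codes)
print("top-5 matches:", order[:5], "at distances", dists[order[:5]])
```

Because the codes are binary, Hamming distances reduce to bit-count operations, which is what makes retrieval over large databases fast and storage compact, as the abstract argues.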
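The joint loss described above combines similarity supervision from triplet labels with a quantization penalty. The abstract does not give the exact formulation, so the following is a hedged sketch of one common way to combine the two terms: a triplet ranking loss pulls same-class codes together and pushes different-class codes apart, while a quantization term pushes each real-valued output toward ±1 so that binarization loses little information. The margin and weight `lam` are illustrative hyperparameters, not values from the paper.

```python
import numpy as np

def triplet_quant_loss(anchor, positive, negative, margin=1.0, lam=0.1):
    """Illustrative joint objective (sketch, not the paper's exact loss).

    anchor/positive share a class; negative comes from a different class.
    The quantization term penalizes outputs whose magnitude is far from 1,
    so that binarizing with sign() afterwards discards little information.
    """
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    ranking = max(0.0, d_pos - d_neg + margin)
    quant = sum(float(np.sum((np.abs(v) - 1.0) ** 2))
                for v in (anchor, positive, negative))
    return ranking + lam * quant

# Ideal case: exactly binary codes, anchor == positive, negative far away.
a = np.array([1.0, -1.0, 1.0, -1.0])
n = np.array([-1.0, 1.0, 1.0, -1.0])
print(triplet_quant_loss(a, a.copy(), n))  # → 0.0
```

Triplet supervision of this kind also addresses the sample imbalance the abstract mentions: each training example is a relative comparison rather than a per-class sample, so rare classes still contribute informative triplets.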