{"title":"Robust and Efficient Shoe Print Image Retrieval Using Spatial Transformer Network and Deep Hashing","authors":"Wei Liu, Dawei Xu","doi":"10.1145/3532342.3532356","DOIUrl":null,"url":null,"abstract":"In recent years, great progress has been made on the topic of shoe print image retrieval. However, it still remains a big challenge to accurately retrieve well-matched shoe print images from a huge database in real time. Deep hashing method has been proved to be effective in large-scale image retrieval in many applications. Output of deep hash network can be represented as a binary bit hash code, which helps to reduce storage space and retrieval time. Existing shoe print image retrieval methods have poor performance in instance retrieval because of shortage of classification information, sufficient sample information and feature quantization information. In order to overcome the problem, we put forward an end-to-end network to learn short deep hash codes. The learned hash code preserves the classification and small sample information very well. Moreover, due to the fact that STN (Spatial Transformer Network) block is simultaneously embedded into the hash network to enhance the retrieval ability for rotated shoe print images, the problem of rotation misalignment can be solved and the retrieval accuracy is improved. Furthermore, in order to make better use of class label information, we presented a new joint loss function. This loss function helps the network map both images’ classification information and similarities into hash codes and reduce quantitative loss. In addition, we used triple labels to alleviate sample imbalance problem. Experiments on database including 10,500 shoe print images show that our proposed method can improve the retrieval performance. 
The proposed approach can yield a mAP (mean Average Precision) of 0.83 and a recall of 0.35, which demonstrates the discriminatory power of the learned hash codes in shoe print image retrieval application.","PeriodicalId":398859,"journal":{"name":"Proceedings of the 4th International Symposium on Signal Processing Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th International Symposium on Signal Processing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3532342.3532356","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In recent years, great progress has been made in shoe print image retrieval. However, accurately retrieving well-matched shoe print images from a huge database in real time remains a major challenge. Deep hashing has proved effective for large-scale image retrieval in many applications: the output of a deep hash network can be represented as a binary hash code, which reduces both storage space and retrieval time. Existing shoe print image retrieval methods perform poorly on instance retrieval because they fail to exploit classification information, suffer from insufficient sample information, and lose information through feature quantization. To overcome these problems, we propose an end-to-end network that learns short deep hash codes. The learned hash codes preserve classification and small-sample information well. Moreover, a Spatial Transformer Network (STN) block is embedded into the hash network to enhance retrieval of rotated shoe print images, solving the rotation misalignment problem and improving retrieval accuracy. Furthermore, to make better use of class label information, we present a new joint loss function that helps the network map both classification information and inter-image similarities into the hash codes while reducing quantization loss. In addition, we use triplet labels to alleviate the sample imbalance problem. Experiments on a database of 10,500 shoe print images show that the proposed method improves retrieval performance, yielding a mAP (mean Average Precision) of 0.83 and a recall of 0.35, which demonstrates the discriminative power of the learned hash codes for shoe print image retrieval.
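The abstract's core idea of hash-based retrieval can be sketched as follows: quantize a network's real-valued outputs into binary codes, then rank the database by Hamming distance to the query code. This is a minimal illustrative sketch, not the paper's implementation; the code length (48 bits), database size, and random embeddings standing in for network outputs are all assumptions.

```python
import numpy as np

def binarize(embeddings):
    """Quantize real-valued network outputs to {0, 1} hash codes via sign."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database entries by Hamming distance (number of differing bits)."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

# Random embeddings stand in for the hash network's outputs (hypothetical).
rng = np.random.default_rng(0)
db_codes = binarize(rng.standard_normal((1000, 48)))  # 1000 images, 48-bit codes
query = binarize(rng.standard_normal(48))

order, dists = hamming_rank(query, db_codes)
print("top-5 matches:", order[:5], "at distances", dists[order[:5]])
```

Because the codes are binary, Hamming distances reduce to bit-count operations, which is what makes retrieval over large databases fast and storage compact, as the abstract argues.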
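The joint loss described above combines similarity supervision from triplet labels with a quantization penalty. The abstract does not give the exact formulation, so the following is a hedged sketch of one common way to combine the two terms: a triplet ranking loss pulls same-class codes together and pushes different-class codes apart, while a quantization term pushes each real-valued output toward ±1 so that binarization loses little information. The margin and weight `lam` are illustrative hyperparameters, not values from the paper.

```python
import numpy as np

def triplet_quant_loss(anchor, positive, negative, margin=1.0, lam=0.1):
    """Illustrative joint objective (sketch, not the paper's exact loss).

    anchor/positive share a class; negative comes from a different class.
    The quantization term penalizes outputs whose magnitude is far from 1,
    so that binarizing with sign() afterwards discards little information.
    """
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    ranking = max(0.0, d_pos - d_neg + margin)
    quant = sum(float(np.sum((np.abs(v) - 1.0) ** 2))
                for v in (anchor, positive, negative))
    return ranking + lam * quant

# Ideal case: exactly binary codes, anchor == positive, negative far away.
a = np.array([1.0, -1.0, 1.0, -1.0])
n = np.array([-1.0, 1.0, 1.0, -1.0])
print(triplet_quant_loss(a, a.copy(), n))  # → 0.0
```

Triplet supervision of this kind also addresses the sample imbalance the abstract mentions: each training example is a relative comparison rather than a per-class sample, so rare classes still contribute informative triplets.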