Vinicius Sato Kawai, Lucas Pascotti Valem, Alexandro Baldassin, Edson Borin, Daniel Carlos Guimarães Pedronette, Longin Jan Latecki
{"title":"Rank-based Hashing for Effective and Efficient Nearest Neighbor Search for Image Retrieval","authors":"Vinicius Sato Kawai, Lucas Pascotti Valem, Alexandro Baldassin, Edson Borin, Daniel Carlos Guimarães Pedronette, Longin Jan Latecki","doi":"10.1145/3659580","DOIUrl":null,"url":null,"abstract":"<p>The large and growing amount of digital data creates a pressing need for approaches capable of indexing and retrieving multimedia content. A traditional and fundamental challenge consists of effectively and efficiently performing nearest-neighbor searches. After decades of research, several different methods are available, including trees, hashing, and graph-based approaches. Most of the current methods exploit learning to hash approaches based on deep learning. In spite of effective results and compact codes obtained, such methods often require a significant amount of labeled data for training. Unsupervised approaches also rely on expensive training procedures usually based on a huge amount of data. In this work, we propose an unsupervised data-independent approach for nearest neighbor searches, which can be used with different features, including deep features trained by transfer learning. The method uses a rank-based formulation and exploits a hashing approach for efficient ranked list computation at query time. A comprehensive experimental evaluation was conducted on 7 public datasets, considering deep features based on CNNs and Transformers. Both effectiveness and efficiency aspects were evaluated. The proposed approach achieves remarkable results in comparison to traditional and state-of-the-art methods. Hence, it is an attractive and innovative solution, especially when costly training procedures need to be avoided.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"17 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3659580","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
The large and growing amount of digital data creates a pressing need for approaches capable of indexing and retrieving multimedia content. A traditional and fundamental challenge consists of effectively and efficiently performing nearest-neighbor searches. After decades of research, several different methods are available, including trees, hashing, and graph-based approaches. Most of the current methods exploit learning to hash approaches based on deep learning. In spite of effective results and compact codes obtained, such methods often require a significant amount of labeled data for training. Unsupervised approaches also rely on expensive training procedures usually based on a huge amount of data. In this work, we propose an unsupervised data-independent approach for nearest neighbor searches, which can be used with different features, including deep features trained by transfer learning. The method uses a rank-based formulation and exploits a hashing approach for efficient ranked list computation at query time. A comprehensive experimental evaluation was conducted on 7 public datasets, considering deep features based on CNNs and Transformers. Both effectiveness and efficiency aspects were evaluated. The proposed approach achieves remarkable results in comparison to traditional and state-of-the-art methods. Hence, it is an attractive and innovative solution, especially when costly training procedures need to be avoided.
期刊介绍:
The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome.
TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.