Image Retrieval Using Multilayer Feature Aggregation Histogram
Authors: Fen Lu, Guang-Hai Liu, Xiao-Zhi Gao
Journal: Cognitive Computation (Impact Factor 4.3; JCR Q2, Computer Science, Artificial Intelligence)
Publication date: 2024-08-27
DOI: 10.1007/s12559-024-10334-9 (https://doi.org/10.1007/s12559-024-10334-9)
Aggregating diverse features into a compact representation is an active topic in image retrieval. However, aggregating the differing features of multiple layers into a discriminative representation remains challenging. Inspired by value-guided neural mechanisms, a novel representation method, the multilayer feature aggregation histogram, is proposed for image retrieval. It aggregates multilayer features (low-, mid-, and high-layer features) into a discriminative yet compact representation by simulating the neural mechanisms that mediate the ability to make value-guided decisions. The highlights of the proposed method are as follows: (1) A detail-attentive map is proposed to represent the aggregation of low- and mid-layer features; it is used to evaluate distinguishable detail features. (2) A simple, straightforward aggregation method is proposed to re-evaluate distinguishable high-layer features; using a semantic-attentive map, it provides aggregated features that include detail, object, and semantic information. (3) A novel whitening method, difference whitening, is introduced to reduce dimensionality; it does not require seeking a semantically similar training dataset and provides a compact yet discriminative representation. Experiments on popular benchmark datasets demonstrate that the proposed method increases retrieval performance in terms of the mAP metric. Using a 128-dimensional representation, the proposed method achieves higher mAPs than the DSFH, DWDF, and OSAH methods by 0.083, 0.043, and 0.022 on the Oxford5k dataset and by 0.195, 0.036, and 0.071 on the Paris6k dataset. The difference whitening method makes it convenient to transfer the deep learning model to a new task. The method provides competitive performance compared with existing aggregation methods and can retrieve scene images with similar colors, objects, and semantics.
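For readers unfamiliar with the mAP metric used to report the results above, the following is a minimal sketch of how mean average precision is commonly computed from ranked retrieval lists. It is not the authors' evaluation code, and the official Oxford5k and Paris6k protocols may differ in details (for example, how ignored images are handled). The function names `average_precision` and `mean_average_precision` and the toy image identifiers are assumptions made for illustration.

```python
from typing import Hashable, Sequence, Set


def average_precision(ranked_ids: Sequence[Hashable],
                      relevant_ids: Set[Hashable]) -> float:
    """AP for one query: precision@k averaged over the ranks k of relevant hits.

    Assumes ranked_ids contains no duplicates (hypothetical input format).
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant items that were never retrieved contribute zero precision.
    return precision_sum / len(relevant_ids)


def mean_average_precision(rankings: Sequence[Sequence[Hashable]],
                           ground_truth: Sequence[Set[Hashable]]) -> float:
    """mAP: the unweighted mean of the per-query AP values."""
    aps = [average_precision(r, g) for r, g in zip(rankings, ground_truth)]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries with made-up image identifiers.
    rankings = [["a", "x", "b", "y"], ["x", "c"]]
    ground_truth = [{"a", "b"}, {"c"}]
    print(mean_average_precision(rankings, ground_truth))
```

In this toy case the first query has relevant hits at ranks 1 and 3, giving AP = (1/1 + 2/3) / 2, and the second has a single hit at rank 2, giving AP = 1/2. A reported mAP gain such as 0.083 is a difference between two such averages taken over all benchmark queries.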
About the journal:
Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work involving biologically-inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a new platform for the dissemination of research, current practices and future trends in the emerging discipline of cognitive computation that bridges the gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities.