Jiwei Nie, Qixi Zhao, Dingyu Xue, Feng Pan, Wei Liu
Journal of Visual Communication and Image Representation, Volume 110, Article 104440. DOI: 10.1016/j.jvcir.2025.104440. Published 2025-04-22.
EPSA-VPR: A lightweight visual place recognition method with an Efficient Patch Saliency-weighted Aggregator
Visual Place Recognition (VPR) is important in autonomous driving, as it enables vehicles to identify their positions by matching against a pre-built database. Prior research in this domain highlights the benefit of recognizing and emphasizing high-saliency local features during descriptor aggregation. Following this path, we introduce EPSA-VPR, a lightweight VPR method built around a proposed Efficient Patch Saliency-weighted Aggregator (EPSA), which also addresses the computational-efficiency demands of large-scale scenarios. With almost negligible computational overhead, EPSA computes local saliency and integrates it into the global descriptor. To quantitatively evaluate its effectiveness, EPSA-VPR is validated on a range of VPR benchmarks. The comprehensive evaluations confirm that our method outperforms existing advanced VPR techniques and achieves competitive performance; notably, EPSA-VPR attains the second-best performance even among two-stage VPR methods, without requiring re-ranking computations. Moreover, the model remains effective even under considerable dimensionality reduction. Visualization analysis reveals the interpretability of EPSA-VPR: after training, the backbone network learns to pay more attention to task-related elements, which makes the final descriptor more discriminative.
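The abstract describes aggregating local patch features into a global descriptor, weighted by each patch's saliency. The paper's actual EPSA module is not reproduced here; the following is only a minimal illustrative sketch of generic saliency-weighted aggregation, assuming (hypothetically) that per-patch saliency is proxied by the patch feature's L2 norm and that the weighted sum is L2-normalized at the end.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm (no-op direction if all zeros)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def saliency_weighted_aggregate(patches):
    """Aggregate patch descriptors into one global descriptor.

    `patches` is a list of equal-length feature vectors. The saliency
    proxy used here (feature norm) is an assumption for illustration,
    not the saliency estimator proposed in the paper.
    """
    # Per-patch saliency weights, normalized to sum to 1.
    weights = [math.sqrt(sum(x * x for x in p)) for p in patches]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Saliency-weighted sum over patches, dimension by dimension.
    dim = len(patches[0])
    desc = [sum(w * p[i] for w, p in zip(weights, patches)) for i in range(dim)]
    return l2_normalize(desc)
```

In this sketch, high-saliency patches contribute proportionally more to the global descriptor, which is the general intuition the abstract attributes to saliency-weighted aggregation.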
About the journal:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.