Place recognition meet multiple modalities: a comprehensive review, current challenges and future development

IF 13.9 | CAS Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zhenyu Li, Tianyi Shang, Pengjie Xu, Zhaojun Deng
Journal: Artificial Intelligence Review, vol. 58, no. 11
Published: 2025-08-30
DOI: 10.1007/s10462-025-11367-8
Article: https://link.springer.com/article/10.1007/s10462-025-11367-8
PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11367-8.pdf
Citation count: 0

Abstract

Place recognition is a cornerstone of vehicle navigation and mapping: it enables a system to determine whether a location has been previously visited. This capability is critical for tasks such as loop closure in Simultaneous Localization and Mapping (SLAM) and long-term navigation under varying environmental conditions. This survey reviews recent advances in place recognition, emphasizing three representative methodological paradigms: Convolutional Neural Network (CNN)-based approaches, Transformer-based frameworks, and cross-modal strategies. We begin by explaining the significance of place recognition within the broader context of autonomous systems. We then trace the evolution of CNN-based methods, highlighting their contributions to robust visual descriptor learning and scalability in large-scale environments. Next, we examine the emerging class of Transformer-based models, which leverage self-attention mechanisms to capture global dependencies and offer improved generalization across diverse scenes. Furthermore, we discuss cross-modal approaches that integrate heterogeneous data sources such as LiDAR, vision, and text descriptions, thereby enhancing resilience to viewpoint, illumination, and seasonal variations. We also summarize standard datasets and evaluation metrics widely adopted in the literature. To the best of our knowledge, no prior survey has systematically reviewed visual, LiDAR, and cross-modal place recognition together. This work thus fills a critical gap in an existing literature dominated by single-modality studies. Finally, we identify current research challenges and outline prospective directions, including domain adaptation, real-time performance, and lifelong learning, to inspire future advances in this domain.
A unified code library of leading-edge place recognition methods, together with the results of their experimental evaluations, is available at https://github.com/CV4RA/SOTA-Place-Recognitioner.
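
The abstract describes place recognition as deciding whether a location has been visited before, using learned descriptors and standard evaluation metrics, but it gives no implementation. The sketch below is one possible illustration of that idea as descriptor retrieval, not the authors' method and not code from the linked repository. It assumes each place is already summarized by a fixed-length descriptor (from any CNN, Transformer, or cross-modal encoder), compares descriptors by cosine similarity, and uses a hypothetical revisit threshold of 0.9. Recall@N is shown as one example metric that is often reported in the place recognition literature; the abstract itself does not name it. All function names and the toy data are made up for illustration.

```python
import math


def l2_normalize(v):
    """Scale a descriptor to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero descriptor")
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two descriptors of equal length."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def retrieve(query, database, top_n=1):
    """Return the indices of the top_n database descriptors most similar to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: cosine(query, database[i]),
                    reverse=True)
    return ranked[:top_n]


def is_revisit(query, database, threshold=0.9):
    """Flag a revisit when the best match reaches the (assumed) similarity threshold."""
    if not database:
        return False, None
    best = retrieve(query, database, top_n=1)[0]
    return cosine(query, database[best]) >= threshold, best


def recall_at_n(queries, database, ground_truth, top_n=1):
    """Fraction of queries with at least one true match among the top_n retrievals."""
    hits = 0
    for query, truth in zip(queries, ground_truth):
        if truth & set(retrieve(query, database, top_n)):
            hits += 1
    return hits / len(queries)


# Toy data (hypothetical): three mapped places and two query observations.
database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]
ground_truth = [{0}, {2}]  # set of correct database indices per query

print(is_revisit(queries[0], database))
print(recall_at_n(queries, database, ground_truth, top_n=1))
```

In practice the exhaustive sort would be replaced by an approximate nearest-neighbor index for large maps, and the threshold would be tuned per dataset rather than fixed.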

Source journal: Artificial Intelligence Review (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Articles published: 194
Review time: 5.3 months
Journal description: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.