An Efficient 3D Point Cloud-Based Place Recognition Approach for Underground Tunnels Using Convolution and Self-Attention Mechanism

IF 4.2 · Zone 2 (Computer Science) · JCR Q2 (Robotics)
Tao Ye, Ao Liu, Xiangpeng Yan, Xiangming Yan, Yu Ouyang, Xiangpeng Deng, Xiao Cong, Fan Zhang
Journal of Field Robotics, vol. 42, no. 4, pp. 1537-1549. Published 2024-11-21. DOI: 10.1002/rob.22451
https://onlinelibrary.wiley.com/doi/10.1002/rob.22451
Citations: 0

Abstract

Existing place recognition methods rely heavily on distinctive geometric features in the data. Applied directly to underground tunnels, where spatial structures repeat and texture features are blurry, these methods can produce misjudgments that reduce positioning accuracy. In addition, their substantial computational demands make real-time feedback of positioning information difficult. To address these challenges, we first introduce the Feature Reconstruction Convolution Module, which reconstructs the similar feature patterns prevalent in underground tunnels and aggregates discriminative feature descriptors, thereby enhancing environmental discrimination. Next, the Sinusoidal Self-Attention Module actively filters local descriptors, assigns weights to different descriptors, and determines the most valuable feature descriptors in the network. Finally, the Rotation-Equivariant Downsampling Module is integrated to expand the receptive field, merge features, and reduce computational complexity. In experiments, our algorithm achieves a maximum score of 0.996 on the SubT-Tunnel data set and 0.995 on the KITTI data set. Moreover, the network has only 0.78 million parameters, and the computation time for a single point cloud frame is 17.3 ms. These results surpass those of many advanced algorithms, demonstrating the effectiveness of our approach.
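
To make the general idea concrete, the following is a minimal sketch of descriptor-based place recognition: local descriptors are reweighted by self-attention, pooled into one global descriptor, and matched against a database by cosine similarity. It is not the authors' network. It uses plain scaled dot-product attention with no learned projections in place of the Sinusoidal Self-Attention Module, and the function names (attention_pool, retrieve) and the toy 2-D descriptors are hypothetical, chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def attention_pool(local_descs):
    """Assumption: queries, keys and values are the raw descriptors themselves.

    Each descriptor is replaced by an attention-weighted mix of all
    descriptors, then the results are averaged and L2-normalized to
    form one global descriptor for the point cloud frame.
    """
    dim = len(local_descs[0])
    scale = math.sqrt(dim)
    attended = []
    for q in local_descs:
        weights = softmax([dot(q, k) / scale for k in local_descs])
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, local_descs)) for j in range(dim)]
        )
    pooled = [sum(row[j] for row in attended) / len(attended) for j in range(dim)]
    return l2_normalize(pooled)


def retrieve(query_desc, database):
    """Return the index of the database descriptor most similar to the query."""
    sims = [dot(query_desc, d) for d in database]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy example: two known places and one revisit of the first place.
place_a = [[1.0, 0.0], [0.9, 0.1]]
place_b = [[0.0, 1.0], [0.1, 0.9]]
query = [[0.95, 0.05], [1.0, 0.0]]

database = [attention_pool(place_a), attention_pool(place_b)]
match_index = retrieve(attention_pool(query), database)
```

Because the global descriptors are unit length, the dot product in retrieve equals cosine similarity. In a real system the local descriptors would come from a learned point cloud encoder, and the linear scan would typically be replaced by an indexed nearest-neighbor search.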

Source journal: Journal of Field Robotics (Engineering & Technology: Robotics)
CiteScore: 15.00
Self-citation rate: 3.60%
Articles published: 80
Review time: 6 months
About the journal: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.