An Efficient 3D Point Cloud-Based Place Recognition Approach for Underground Tunnels Using Convolution and Self-Attention Mechanism

IF 4.2 | Zone 2, Computer Science | JCR Q2, Robotics

Journal of Field Robotics Pub Date : 2024-11-21 DOI:10.1002/rob.22451

Tao Ye, Ao Liu, Xiangpeng Yan, Xiangming Yan, Yu Ouyang, Xiangpeng Deng, Xiao Cong, Fan Zhang

Volume 42, Issue 4, Pages 1537-1549

Citations: 0

Abstract

Existing place recognition methods rely too heavily on effective geometric features in the data. When applied directly to underground tunnels, which have repetitive spatial structures and blurry texture features, these methods can misjudge locations and thereby reduce positioning accuracy. In addition, the substantial computational demands of current methods make real-time feedback of positioning information difficult. To address these challenges, we first introduce the Feature Reconstruction Convolution Module, which reconstructs the similar feature patterns prevalent in underground tunnels and aggregates discriminative feature descriptors, thereby enhancing environmental discrimination. Next, the Sinusoidal Self-Attention Module actively filters local descriptors, assigns weights to different descriptors, and identifies the most valuable feature descriptors in the network. Finally, the Rotation-Equivariant Downsampling Module is integrated to expand the receptive field, merge features, and reduce computational complexity. In experiments, our algorithm achieves a maximum score of 0.996 on the SubT-Tunnel data set and 0.995 on the KITTI data set. Moreover, the network has only 0.78 million parameters, and processing a single point cloud frame takes 17.3 ms. These results surpass those of many advanced algorithms, demonstrating the effectiveness of our approach.
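The abstract describes weighting local descriptors with self-attention and aggregating them for place recognition, but gives no implementation details. The following is a minimal sketch of that general idea only: plain scaled dot-product self-attention without learned projections, mean pooling into a global descriptor, and cosine-similarity retrieval. It is not the paper's Sinusoidal Self-Attention Module, and every function name and all sample data here are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_descriptor(local_descriptors):
    """Self-attend over local descriptors, mean-pool, and L2-normalize.

    Assumption: queries, keys, and values are the raw descriptors
    themselves (no learned weights), which a real network would not do.
    """
    dim = len(local_descriptors[0])
    scale = math.sqrt(dim)
    attended = []
    for query in local_descriptors:
        # Attention weights of this descriptor over all descriptors.
        weights = softmax([dot(query, key) / scale for key in local_descriptors])
        attended.append([
            sum(w * value[i] for w, value in zip(weights, local_descriptors))
            for i in range(dim)
        ])
    pooled = [sum(row[i] for row in attended) / len(attended) for i in range(dim)]
    norm = math.sqrt(dot(pooled, pooled)) or 1.0
    return [x / norm for x in pooled]

def best_match(query_global, database_globals):
    """Return the index of the most similar database entry and all similarities.

    All descriptors are unit length, so the dot product is the cosine similarity.
    """
    sims = [dot(query_global, g) for g in database_globals]
    return max(range(len(sims)), key=sims.__getitem__), sims

# Hypothetical toy data: two stored places and one query frame.
place_a = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
place_b = [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
query_frame = [[0.95, 0.05, 0.0], [1.0, 0.0, 0.05]]

database = [global_descriptor(place_a), global_descriptor(place_b)]
query_vec = global_descriptor(query_frame)
best_index, similarities = best_match(query_vec, database)
print(best_index, similarities)
```

In this toy setup the query frame is built to resemble the first stored place, so retrieval should select that entry. A trained network would replace the raw descriptors with learned features and projections.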


Source journal

Journal of Field Robotics (Engineering and Technology: Robotics)

CiteScore: 15.00

Self-citation rate: 3.60%

Review time: 6 months

Journal description: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.