End-to-end point cloud registration with transformer
Yong Wang, Pengbo Zhou, Guohua Geng, Li An, Qi Zhang
Artificial Intelligence Review, volume 58, issue 1. Published 2024-11-26. Journal Article.
DOI: 10.1007/s10462-024-10985-y
https://link.springer.com/article/10.1007/s10462-024-10985-y
Impact factor: 10.7 (JCR Q1, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
With large-scale 3D point cloud data now widely used in real-world scenarios, efficient and accurate point cloud registration has become a crucial challenge. We propose an end-to-end point cloud registration method based on the Transformer architecture. The method targets two difficult cases, registration under low overlap and registration in large scenes, while remaining general and efficient. Within the Transformer, we combine dynamic position encoding with ternary angular position encoding, which strengthens the representation of point cloud data and improves the generality of the algorithm in large scenes. To enhance the learning capacity of the attention mechanism, we employ an improved cross-attention mechanism that multiplies the softmax output by adaptive weights, enabling the model to capture key information within the point cloud more accurately. In the decoding stage, we introduce a multi-scale feature fusion approach that exploits the multi-layer information in point cloud data, which mitigates information loss, handles matching between point clouds of varying sizes, and further improves registration accuracy and robustness. Experiments on the 3DMatch, ModelNet, KITTI, and MVP-RG datasets demonstrate the method's effectiveness on low-overlap and large-scene registration tasks.
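The abstract's description of cross-attention whose softmax output is multiplied by adaptive weights can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the adaptive weights are computed, so the sigmoid gate per source point, the function name `weighted_cross_attention`, and the projection matrices `w_q`, `w_k`, `w_v`, `w_gate` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def weighted_cross_attention(src_feats, tgt_feats, w_q, w_k, w_v, w_gate):
    """Cross-attention from N source points to M target points.

    ASSUMPTION: the adaptive weight is a sigmoid gate in (0, 1) computed
    once per source point from its own features. The abstract only states
    that the softmax is multiplied by adaptive weights.
    """
    q = src_feats @ w_q                        # (N, d) queries from the source cloud
    k = tgt_feats @ w_k                        # (M, d) keys from the target cloud
    v = tgt_feats @ w_v                        # (M, d) values from the target cloud

    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M) scaled dot-product scores
    attn = softmax(scores, axis=-1)            # each row sums to 1

    gate = 1.0 / (1.0 + np.exp(-(src_feats @ w_gate)))  # (N, 1) hypothetical adaptive weight
    weighted = attn * gate                     # broadcast: rescale each row of the softmax

    return weighted @ v, weighted              # (N, d) fused features, (N, M) weights
```

With random inputs of shape `(5, 8)` for the source and `(7, 8)` for the target, the function returns fused features of shape `(5, 8)` and a `(5, 7)` weight matrix whose rows sum to the gate value rather than to 1, so a source point with a small gate contributes little cross-cloud information.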
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.