Integrating Linear Skip-Attention With Transformer-Based Network of Multi-Level Features Extraction for Partial Point Cloud Registration

IF 2 · Zone 4 Computer Science · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IET Image Processing Pub Date : 2025-04-15 DOI:10.1049/ipr2.70055

Qinyu He, Tao Sun


Citations: 0

Abstract

Accurate point correspondences are critical for rigid point cloud registration in correspondence-based methods. Many previous learning-based methods employ an encoder-decoder backbone for point feature extraction and apply an attention mechanism to sparse superpoints to handle partial overlap. However, few of these methods attend to the intermediate layers; most focus on the top-most patch features, neglecting multi-faceted feature perspectives, which can lead to inaccurate estimation of potential overlap areas. Meanwhile, obtaining correct correspondences is often hindered by one-to-many matches and outliers. To address these issues, we propose a multi-level feature extraction network that integrates a linear dual-attention mechanism into the skip-connection stage of the encoder-decoder backbone. This mechanism efficiently suppresses irrelevant information and guides residual features to learn the common regions on which the network should focus, addressing the inaccurate overlap estimation. It is combined with a parallel-structured decoder that forms distinguishable features and potential overlapping regions. Additionally, a two-stage correspondence pruning process, which relies mainly on the rigid geometric constraint, is designed to tackle the mismatch issue. Extensive experiments on indoor and outdoor scene datasets demonstrate our method's accuracy and stability: it outperforms state-of-the-art methods on registration recall.
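The abstract does not spell out how the rigid geometric constraint is applied, so the following is only a minimal sketch of one common form of it, not the authors' two-stage procedure. It assumes that a rigid transform preserves pairwise distances, lets each correspondence collect votes from the others it is length-consistent with, and keeps the well-supported ones. The function name `prune_by_length_consistency` and the parameters `tau` and `keep_ratio` are hypothetical, chosen for illustration.

```python
import math


def prune_by_length_consistency(src, dst, tau=0.1, keep_ratio=0.5):
    """Return indices of correspondences src[i] -> dst[i] that look rigid.

    For two correct matches i and j, a rigid transform implies
    |src[i] - src[j]| == |dst[i] - dst[j]| up to noise (tolerance tau).
    tau and keep_ratio are illustrative values, not taken from the paper.
    """
    n = len(src)
    if n == 0:
        return []
    votes = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d_src = math.dist(src[i], src[j])
            d_dst = math.dist(dst[i], dst[j])
            if abs(d_src - d_dst) < tau:
                votes[i] += 1
                votes[j] += 1
    # Require at least one supporting pair, and a share of the best score.
    threshold = max(1.0, keep_ratio * max(votes))
    return [i for i in range(n) if votes[i] >= threshold]


# Four matches related by a 90-degree rotation about z plus a shift,
# followed by one deliberately wrong match (index 4).
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
dst = [(2, 0, 0), (2, 1, 0), (1, 0, 0), (2, 0, 1), (5, 5, 5)]
print(prune_by_length_consistency(src, dst))  # -> [0, 1, 2, 3]
```

The quadratic pairwise loop is acceptable for a small set of candidate matches; a practical pipeline would vectorize it and typically follow it with a robust transform estimate.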




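Source journal

IET Image Processing (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 5.40
Self-citation rate: 8.70%
Articles published: 282
Review time: 6 months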

Journal description: The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.

Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.

Current Special Issue Call for Papers:
Evolutionary Computation for Image Processing - https://digital-library.theiet.org/files/IET_IPR_CFP_EC.pdf
AI-Powered 3D Vision - https://digital-library.theiet.org/files/IET_IPR_CFP_AIPV.pdf
Multidisciplinary advancement of Imaging Technologies: From Medical Diagnostics and Genomics to Cognitive Machine Vision, and Artificial Intelligence - https://digital-library.theiet.org/files/IET_IPR_CFP_IST.pdf
Deep Learning for 3D Reconstruction - https://digital-library.theiet.org/files/IET_IPR_CFP_DLR.pdf