Cross-modal 2D-3D feature matching: simultaneous local feature description and detection across images and point clouds

IF 12.2 · Zone 1 (Earth Sciences) · JCR Q1 GEOGRAPHY, PHYSICAL

ISPRS Journal of Photogrammetry and Remote Sensing Pub Date : 2025-08-27 DOI:10.1016/j.isprsjprs.2025.08.016

Wei Ma, Yucheng Huang, Shengjun Tang, Xianwei Zheng, Zhen Dong, Liang Ge, Jianping Pan, Qingquan Li, Bing Wang

Volume 229, Pages 155-169. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0924271625003272

Citations: 0

Abstract

Establishing correspondences between 2D images and 3D models is essential for precise 3D modeling and accurate positioning. However, widely adopted techniques for aligning 2D images with 3D features depend heavily on dense 3D reconstructions, which not only incur significant computational demands but also tend to exhibit reduced accuracy in texture-poor environments. In this study, we propose a novel method that combines local feature description and detection to enable direct and automatic alignment of 2D images with 3D models. Our approach utilizes a twin convolutional network architecture to process images and 3D data, generating respective feature maps. To address the non-uniform distribution of pixel and spatial point densities, we introduce an ultra-wide perception mechanism to expand the receptive field of image convolution kernels. Next, we apply a non-local maximum suppression criterion to concurrently evaluate the salience of pixels and 3D points. Additionally, we design an adaptive weight optimization loss function that dynamically guides learning objectives toward sample similarity. We rigorously validate our approach on multiple datasets, and our findings demonstrate successful co-extraction of cross-modal feature points. Through comprehensive 2D-3D feature matching experiments, we benchmark our method against several state-of-the-art techniques from recent literature. The results show that our method outperforms these techniques on nearly all evaluated metrics, underscoring its effectiveness.
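The abstract does not say how descriptors from the two branches are paired once they have been extracted. As a rough illustration only, the sketch below assumes a mutual nearest-neighbor check on cosine similarity, a common choice in local feature matching and not necessarily the one used in the paper. The function names and the toy descriptors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity_matrix(
    desc_2d: Sequence[Sequence[float]], desc_3d: Sequence[Sequence[float]]
) -> List[List[float]]:
    """sim[i][j] is the cosine similarity of pixel descriptor i and point descriptor j."""
    a = [l2_normalize(d) for d in desc_2d]
    b = [l2_normalize(d) for d in desc_3d]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]


def mutual_nearest_neighbors(
    desc_2d: Sequence[Sequence[float]], desc_3d: Sequence[Sequence[float]]
) -> List[Tuple[int, int]]:
    """Keep (pixel, point) index pairs that are each other's best match.

    Assumption: mutual nearest-neighbor filtering is used here purely for
    illustration; the paper's actual matching strategy is not given in the abstract.
    """
    if not desc_2d or not desc_3d:
        return []
    sim = cosine_similarity_matrix(desc_2d, desc_3d)
    # Best 3D point for every pixel descriptor (row-wise argmax).
    best_point = [max(range(len(row)), key=row.__getitem__) for row in sim]
    # Best pixel for every 3D point descriptor (column-wise argmax).
    best_pixel = [
        max(range(len(sim)), key=lambda i, j=j: sim[i][j])
        for j in range(len(desc_3d))
    ]
    return [(i, j) for i, j in enumerate(best_point) if best_pixel[j] == i]


# Toy descriptors: three image keypoints, two point-cloud keypoints.
pixel_desc = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
point_desc = [[0.9, 0.1], [0.1, 0.9]]
matches = mutual_nearest_neighbors(pixel_desc, point_desc)
print(matches)
```

In this toy case the third pixel descriptor prefers the first 3D point, but that point prefers the first pixel, so the pair is rejected. This is the kind of one-sided match the mutual check is meant to filter out.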




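Source journal

ISPRS Journal of Photogrammetry and Remote Sensing (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days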

Journal description: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.