GLS–MIFT: A modality invariant feature transform with global-to-local searching

Impact Factor 14.7 · CAS Tier 1 · Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zhongli Fan, Yingdong Pi, Mi Wang, Yifei Kang, Kai Tan
DOI: 10.1016/j.inffus.2024.102252
Journal: Information Fusion · Published: 2024-01-12 (Journal Article) · Citations: 0
Publisher page: https://www.sciencedirect.com/science/article/pii/S1566253524000307

Abstract

Accurate image matching is the basis for many information fusion-related applications. Conventional methods fail when handling multimodal image pairs with severe nonlinear radiation distortion (NRD) and geometric transformations. To address this problem, we present an effective method, termed modality-invariant feature transform with global-to-local searching (GLS–MIFT). First, we addressed scale changes by constructing the image scale space. Then, we obtained multi-scale, multi-orientation filtering results based on first-order Gaussian steerable filters to exploit their robustness to NRD. Next, we introduced a feature response aggregation model to synthesize the filtering results into a feature map, which we used to detect highly repeatable features. Additionally, we designed an adaptive partitioning descriptor to achieve rotation-invariant feature description, involving the following six steps: generation of statistical histograms of multi-orientation filtering values, synthesis of histograms over multiple scales, estimation and updating of the primary direction, determination of the sampling direction, normalization of the feature vector in sub-regions, and finally, assembly of the complete description vector. A convolutional image grouping strategy was used to enhance the rotational invariance of the method. We also developed a new feature matcher based on the GLS strategy: guided by the results of the global searching stage, the local searching stage further improved the matching accuracy and the reliability of the results. Our experimental results confirmed that GLS–MIFT achieved high-quality matching on a large-scale dataset of 1110 image pairs, covering various multimodal image types from the fields of computer vision, medicine, and remote sensing. GLS–MIFT outperformed state-of-the-art methods, including SIFT, RIFT, HOWP, OFM, MatchFormer, SemLA, and GIFT, in qualitative and quantitative evaluations.
Our implementation and datasets are available at: https://github.com/Zhongli-Fan/GLS-MIFT.
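The global-to-local idea can be sketched in a few lines: a global stage matches descriptors over the whole image and yields a coarse geometric estimate, and a local stage re-matches each feature only within the neighbourhood that estimate predicts. The code below is a deliberately simplified sketch (ratio-test nearest neighbours plus a median-translation model), not the GLS–MIFT matcher itself; the function names, the ratio threshold, and the translation-only geometric model are all assumptions for illustration.

```python
import numpy as np

def nn_ratio_matches(da, db, ratio=0.8):
    """Nearest-neighbour matching with Lowe-style ratio test (global stage)."""
    dist = np.linalg.norm(da[:, None, :] - db[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    matches = []
    for i in range(len(da)):
        j1, j2 = order[i, 0], order[i, 1]
        if dist[i, j1] < ratio * dist[i, j2]:
            matches.append((i, int(j1)))
    return matches

def gls_match(da, db, ka, kb, ratio=0.8, radius=20.0):
    """Two-stage matcher sketch: global search guides a restricted local search."""
    coarse = nn_ratio_matches(da, db, ratio)
    if not coarse:
        return []
    # Coarse geometric estimate: median translation between matched keypoints.
    shift = np.median([kb[j] - ka[i] for i, j in coarse], axis=0)
    refined = []
    for i in range(len(da)):
        pred = ka[i] + shift  # predicted location of keypoint i in image B
        near = [j for j in range(len(kb))
                if np.linalg.norm(kb[j] - pred) <= radius]
        if not near:
            continue
        d = np.linalg.norm(db[near] - da[i], axis=1)
        refined.append((i, near[int(np.argmin(d))]))
    return refined
```

Restricting the second-stage search to the predicted neighbourhood both prunes ambiguous candidates and cuts cost, which is the intuition the abstract attributes to guiding the local stage with the global results.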

Source journal: Information Fusion (Engineering & Technology – Computer Science: Theory & Methods)
CiteScore: 33.20 · Self-citation rate: 4.30% · Articles per year: 161 · Review time: 7.9 months
About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating applications to real-world problems, are welcome.