Zhongli Fan , Yingdong Pi , Mi Wang , Yifei Kang , Kai Tan
{"title":"GLS-MIFT:从全球到本地搜索的模态不变特征变换","authors":"Zhongli Fan , Yingdong Pi , Mi Wang , Yifei Kang , Kai Tan","doi":"10.1016/j.inffus.2024.102252","DOIUrl":null,"url":null,"abstract":"<div><p>Accurate image matching is the basis for many information fusion-related applications. Conventional methods fail when handling multimodal image pairs with severe nonlinear radiation distortion (NRD) and geometric transformations. To address this problem, we present an effective method, termed modality-invariant feature transform with global-to-local searching (GLS–MIFT). First, we addressed scale changes by constructing the image scale space. Then, we obtained multi-scale, multi-orientation filtering results based on first-order Gaussian steerable filters to exploit their robustness to NRD. Next, we introduced a feature response aggregation model to synthesize the filtering results to generate a feature map, and used it to detect highly repeatable features. Additionally, we designed an adaptive partitioning descriptor to achieve rotation-invariant feature description, involving the following six steps: generation of statistical histograms of multi-orientation filtering values, synthesis of histograms on multiple scales, estimation and updating of the primary direction, determination of the sampling direction, normalization of the feature vector in sub-regions, and finally, obtaining the complete description vector. A convolutional image grouping strategy was used to enhance the rotational invariance of the method. We developed a new feature matcher based on the GLS strategy. Guided by the results of global searching stage, the local searching stage further improved the matching accuracy and reliability of the results. Our experimental results confirmed that GLS–MIFT achieved high-quality matching for a large-scale dataset of 1110 image pairs, with various multimodal image types from the fields of computer vision, medicine, and remote sensing. 
GLS–MIFT outperformed state-of-the-art methods including SIFT, RIFT, HOWP, OFM, MatchFormer, SemLA and GIFT in qualitative and quantitative evaluations. Our implementation and datasets are available at: <span>https://github.com/Zhongli-Fan/GLS-MIFT</span><svg><path></path></svg>.</p></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":null,"pages":null},"PeriodicalIF":14.7000,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GLS–MIFT: A modality invariant feature transform with global-to-local searching\",\"authors\":\"Zhongli Fan , Yingdong Pi , Mi Wang , Yifei Kang , Kai Tan\",\"doi\":\"10.1016/j.inffus.2024.102252\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Accurate image matching is the basis for many information fusion-related applications. Conventional methods fail when handling multimodal image pairs with severe nonlinear radiation distortion (NRD) and geometric transformations. To address this problem, we present an effective method, termed modality-invariant feature transform with global-to-local searching (GLS–MIFT). First, we addressed scale changes by constructing the image scale space. Then, we obtained multi-scale, multi-orientation filtering results based on first-order Gaussian steerable filters to exploit their robustness to NRD. Next, we introduced a feature response aggregation model to synthesize the filtering results to generate a feature map, and used it to detect highly repeatable features. 
Additionally, we designed an adaptive partitioning descriptor to achieve rotation-invariant feature description, involving the following six steps: generation of statistical histograms of multi-orientation filtering values, synthesis of histograms on multiple scales, estimation and updating of the primary direction, determination of the sampling direction, normalization of the feature vector in sub-regions, and finally, obtaining the complete description vector. A convolutional image grouping strategy was used to enhance the rotational invariance of the method. We developed a new feature matcher based on the GLS strategy. Guided by the results of global searching stage, the local searching stage further improved the matching accuracy and reliability of the results. Our experimental results confirmed that GLS–MIFT achieved high-quality matching for a large-scale dataset of 1110 image pairs, with various multimodal image types from the fields of computer vision, medicine, and remote sensing. GLS–MIFT outperformed state-of-the-art methods including SIFT, RIFT, HOWP, OFM, MatchFormer, SemLA and GIFT in qualitative and quantitative evaluations. 
Our implementation and datasets are available at: <span>https://github.com/Zhongli-Fan/GLS-MIFT</span><svg><path></path></svg>.</p></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":14.7000,\"publicationDate\":\"2024-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1566253524000307\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253524000307","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
GLS–MIFT: A modality invariant feature transform with global-to-local searching
Accurate image matching is the basis for many information fusion-related applications. Conventional methods fail when handling multimodal image pairs with severe nonlinear radiation distortion (NRD) and geometric transformations. To address this problem, we present an effective method, termed modality-invariant feature transform with global-to-local searching (GLS–MIFT). First, we addressed scale changes by constructing the image scale space. Then, we obtained multi-scale, multi-orientation filtering results based on first-order Gaussian steerable filters to exploit their robustness to NRD. Next, we introduced a feature response aggregation model that synthesizes the filtering results into a feature map, and used it to detect highly repeatable features. Additionally, we designed an adaptive partitioning descriptor to achieve rotation-invariant feature description, involving the following six steps: generation of statistical histograms of multi-orientation filtering values, synthesis of the histograms across multiple scales, estimation and updating of the primary direction, determination of the sampling direction, normalization of the feature vector in sub-regions, and finally, assembly of the complete description vector. A convolutional image grouping strategy was used to enhance the rotational invariance of the method. We also developed a new feature matcher based on the GLS strategy: guided by the results of the global searching stage, the local searching stage further improves the accuracy and reliability of the matches. Our experimental results confirmed that GLS–MIFT achieved high-quality matching on a large-scale dataset of 1110 image pairs covering various multimodal image types from the fields of computer vision, medicine, and remote sensing. GLS–MIFT outperformed state-of-the-art methods, including SIFT, RIFT, HOWP, OFM, MatchFormer, SemLA, and GIFT, in both qualitative and quantitative evaluations.
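The multi-scale, multi-orientation filtering step described above can be sketched compactly, because a first-derivative-of-Gaussian filter is steerable: the response at any angle θ is cos(θ)·Gx + sin(θ)·Gy, so only two basis responses are needed per scale. The sketch below is illustrative only and is not the authors' implementation; the `max`-magnitude aggregation at the end is a hypothetical stand-in for the paper's feature response aggregation model, and the scale and orientation counts are assumed defaults.

```python
import numpy as np
from scipy import ndimage

def steerable_responses(img, sigmas=(1.0, 2.0, 4.0), n_orient=6):
    """Multi-scale, multi-orientation responses from first-order
    Gaussian steerable filters.

    Because the filter is steerable, only the two basis responses
    gx (d/dx) and gy (d/dy) are computed per scale; the response at
    angle theta is then cos(theta)*gx + sin(theta)*gy.
    """
    img = img.astype(np.float64)
    thetas = np.arange(n_orient) * np.pi / n_orient
    out = np.empty((len(sigmas), n_orient) + img.shape)
    for i, s in enumerate(sigmas):
        gx = ndimage.gaussian_filter(img, s, order=(0, 1))  # derivative along x
        gy = ndimage.gaussian_filter(img, s, order=(1, 0))  # derivative along y
        for j, t in enumerate(thetas):
            out[i, j] = np.cos(t) * gx + np.sin(t) * gy
    return out

def aggregate_feature_map(resp):
    """Toy aggregation: max response magnitude over scales and
    orientations (a placeholder for the paper's aggregation model)."""
    return np.abs(resp).max(axis=(0, 1))
```

Feature detection would then run a repeatability-oriented detector (e.g. local maxima) on the aggregated map; that stage is omitted here.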
Our implementation and datasets are available at: https://github.com/Zhongli-Fan/GLS-MIFT.
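For the actual matcher, see the repository above. To illustrate the global-to-local idea in the abstract, here is a deliberately simplified sketch: a global stage finds mutual nearest-neighbour matches over all descriptors, and a local stage re-matches each feature against only those candidates near the position predicted by the global result. The mean-translation "guidance" is a hypothetical stand-in for the paper's global-stage model, and all names and parameters here are assumptions for illustration.

```python
import numpy as np

def gls_match(desc_a, desc_b, pts_a, pts_b, radius=20.0):
    """Toy global-to-local matching sketch (not the paper's matcher)."""
    # Global stage: mutual nearest neighbours in descriptor space.
    d = np.linalg.norm(desc_a[:, None] - desc_b[None], axis=2)
    nn_ab = d.argmin(axis=1)
    nn_ba = d.argmin(axis=0)
    mutual = [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
    if not mutual:
        return []
    # Crude geometric guidance: mean translation of the global matches.
    offset = np.mean([pts_b[j] - pts_a[i] for i, j in mutual], axis=0)
    # Local stage: re-match within `radius` of the predicted position.
    refined = []
    for i in range(len(desc_a)):
        pred = pts_a[i] + offset
        cand = np.where(np.linalg.norm(pts_b - pred, axis=1) <= radius)[0]
        if cand.size:
            refined.append((i, int(cand[d[i, cand].argmin()])))
    return refined
```

Restricting the local stage to a spatial neighbourhood is what lets the second pass recover matches the global descriptor comparison alone would miss or mis-rank.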
Journal overview:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.