{"title":"Multimodal Remote Sensing Image Registration via Modality Perception and Self-Supervised Position Estimation","authors":"Yun Xiao;Chunlei Zhang;Bo Jiang;Yuan Chen;Jin Tang","doi":"10.1109/TGRS.2025.3576290","DOIUrl":null,"url":null,"abstract":"Multimodal remote sensing image registration ensures that images from different sensors or modalities are spatial and informational consistent for effective comparison and analysis. However, due to the nonlinear modality gaps that exist between images, it is difficult to focus solely on spatial positional differences while ignoring the modality gaps. In this article, to address this issue, we propose a new framework for multimodal registration network, named MMRNet. The proposed framework comprises the following main aspects. First, a novel self-supervised positional misalignment estimator (PME) is designed for multimodal image registration. PME can efficiently overcome the modality gaps and learn the positional differences between multimodal images more reliably, optimizing the registration loss by minimizing the positional differences directly. Then, a new paradigm of modality translation, termed modality perception module (MPM), is introduced to effectively learn modality gaps and perform modality translation in the case of positional misalignment. Finally, we further design the modality perception guidance loss to supervise the modality translation task, which can encourage the fidelity of the generated pseudo-modality images. Our registration network integrates both rigid registration model and nonrigid registration model. The experimental results demonstrate that the proposed registration framework can obtain obviously superior performance in both rigid and nonrigid image registration tasks on optical-synthetic aperture radar (SAR) data, optical-map data, and optical-infrared data. The code and relevant dataset will be made publicly available at <uri>https://github.com/Ahuer-Lei/MMRNet</uri>.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"63 ","pages":"1-14"},"PeriodicalIF":7.5000,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/11021679/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Multimodal remote sensing image registration ensures that images from different sensors or modalities are spatially and informationally consistent for effective comparison and analysis. However, because of the nonlinear modality gaps between images, it is difficult to focus solely on spatial positional differences while ignoring those gaps. To address this issue, we propose a new multimodal registration network, named MMRNet. The proposed framework comprises the following main aspects. First, a novel self-supervised positional misalignment estimator (PME) is designed for multimodal image registration. PME can efficiently overcome the modality gaps and learn the positional differences between multimodal images more reliably, optimizing the registration loss by minimizing the positional differences directly. Then, a new paradigm of modality translation, termed the modality perception module (MPM), is introduced to effectively learn modality gaps and perform modality translation under positional misalignment. Finally, we design a modality perception guidance loss to supervise the modality translation task, encouraging the fidelity of the generated pseudo-modality images. Our registration network integrates both rigid and nonrigid registration models. Experimental results demonstrate that the proposed framework achieves clearly superior performance in both rigid and nonrigid image registration tasks on optical-synthetic aperture radar (SAR) data, optical-map data, and optical-infrared data. The code and relevant dataset will be made publicly available at https://github.com/Ahuer-Lei/MMRNet.
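For concreteness, the sketch below illustrates the two-loss training structure the abstract describes: a self-supervised positional loss for the misalignment estimator (a random known transform is applied, and the estimator is trained to recover it) plus a guidance loss on the translated pseudo-modality image. This is a minimal, hypothetical PyTorch-style sketch; all module definitions, names (PME, MPM), shapes, and loss weights are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of MMRNet's two-loss training structure.
# Everything here (architectures, affine parameterization, weights)
# is an assumption for illustration, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PME(nn.Module):
    """Toy positional misalignment estimator: predicts a rigid/affine
    transform (2x3 matrix) between a moving and a fixed image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 6)  # 6 affine parameters

    def forward(self, moving, fixed):
        feat = self.encoder(torch.cat([moving, fixed], dim=1)).flatten(1)
        return self.head(feat).view(-1, 2, 3)

def training_step(pme, mpm, optical, sar):
    """One self-supervised step: warp the optical image by a random
    known affine, then train PME to recover that transform while MPM
    (any image-to-image translator) produces a pseudo-SAR image."""
    b = optical.size(0)
    # Random ground-truth affine: identity plus a small perturbation.
    gt = torch.eye(2, 3).repeat(b, 1, 1) + 0.1 * torch.randn(b, 2, 3)
    grid = F.affine_grid(gt, list(optical.shape), align_corners=False)
    warped_opt = F.grid_sample(optical, grid, align_corners=False)
    warped_sar = F.grid_sample(sar, grid, align_corners=False)

    # Registration loss: minimize the positional (transform) difference
    # directly, as the abstract describes for the PME.
    pred = pme(warped_opt, sar)
    loss_reg = F.l1_loss(pred, gt)

    # Modality perception guidance loss: the generated pseudo-SAR
    # should stay faithful to the spatially corresponding real SAR.
    pseudo_sar = mpm(warped_opt)
    loss_mpg = F.l1_loss(pseudo_sar, warped_sar)

    return loss_reg + loss_mpg  # equal weighting is a placeholder

if __name__ == "__main__":
    pme = PME()
    # Stand-in translator for MPM; the paper's module differs.
    mpm = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 3, padding=1))
    optical = torch.rand(2, 1, 64, 64)
    sar = torch.rand(2, 1, 64, 64)
    loss = training_step(pme, mpm, optical, sar)
    loss.backward()
    print(loss.item())
```

Note that in this toy setup the optical and SAR inputs are assumed to be pre-aligned pairs warped by the same synthetic transform; the paper's MPM is specifically designed to handle translation under real positional misalignment, which this sketch does not reproduce.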
Journal Introduction:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.