Cycle Translation-Based Collaborative Training for Hyperspectral-RGB Multimodal Change Detection
Wenqian Dong; Junying Ren; Song Xiao; Leyuan Fang; Jiahui Qu; Yunsong Li
IEEE Transactions on Image Processing, vol. 34, pp. 6347-6360, published 2025-09-16
DOI: 10.1109/TIP.2025.3607609 (https://ieeexplore.ieee.org/document/11164958/)
Citations: 0
Abstract
Hyperspectral image change detection (HSI-CD) benefits from HSIs with continuous spectral bands, which uniquely enable the analysis of more subtle changes. Existing methods have achieved desirable performance by relying on multi-temporal homogeneous HSIs over the same region, which are generally difficult to obtain in real scenes. HSI-RGB multimodal CD overcomes the constraint of limited HSI availability by incorporating RGB data acquired at another temporal phase, and combining the advantages of the different modalities enhances the robustness of detection results. Nevertheless, due to the different imaging mechanisms of the two modalities, existing HSI-CD methods cannot be directly applied. In this paper, we propose a cycle translation-based collaborative training (co-training) method for HSI-RGB multimodal CD, which achieves cross-modal mutual guidance to collaboratively learn complementary difference information from diverse modalities for identifying changes. Specifically, a cross-modal-guided, CycleGAN-based image translation module is designed to implement bi-directional image translation, which mitigates the modality difference and enables the extraction of information related to land cover changes. Then, a spatial-spectral interactive co-training CD module is proposed to enable iterative interaction of cross-modal information, jointly extracting the multimodal difference features to generate the final results. The proposed method outperforms several leading CD methods in extensive experiments carried out on both real and synthetic datasets. In addition, a new public HSI-RGB multimodal dataset and our code are available at https://github.com/Jiahuiqu/CT2Net
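To make the high-level pipeline in the abstract concrete, the following is a minimal PyTorch sketch of the two ideas it describes: CycleGAN-style bidirectional HSI-RGB translation to bridge the modality gap, and difference features computed in each modality's space for change detection. All class names, layer sizes, the band count, and the loss composition are illustrative assumptions and not the authors' implementation (which is available in the linked repository); the adversarial losses and the iterative co-training exchange are omitted for brevity.

```python
# Illustrative sketch only: bidirectional HSI<->RGB translation plus a simple
# difference-based change head. Names and dimensions are assumptions.
import torch
import torch.nn as nn

class Translator(nn.Module):
    """Small encoder-decoder mapping one modality to the other (e.g., HSI -> RGB)."""
    def __init__(self, in_bands: int, out_bands: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_bands, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_bands, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class ChangeHead(nn.Module):
    """Maps per-pixel difference features to a binary change probability map."""
    def __init__(self, bands: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(bands, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, diff):
        return torch.sigmoid(self.net(diff))

# Illustrative inputs: a T1 HSI with 100 bands and a T2 RGB image, 64x64 pixels.
hsi_bands, rgb_bands = 100, 3
hsi_t1 = torch.randn(1, hsi_bands, 64, 64)
rgb_t2 = torch.randn(1, rgb_bands, 64, 64)

g_hsi2rgb = Translator(hsi_bands, rgb_bands)   # G: HSI -> RGB
g_rgb2hsi = Translator(rgb_bands, hsi_bands)   # F: RGB -> HSI

# Cycle-consistency terms (the CycleGAN adversarial losses are omitted here).
cycle_loss = (
    nn.functional.l1_loss(g_rgb2hsi(g_hsi2rgb(hsi_t1)), hsi_t1)
    + nn.functional.l1_loss(g_hsi2rgb(g_rgb2hsi(rgb_t2)), rgb_t2)
)

# Change cues computed in each modality's space; a real co-training scheme would
# iteratively exchange information between the two branches rather than average.
diff_rgb = torch.abs(g_hsi2rgb(hsi_t1) - rgb_t2)   # difference in RGB space
diff_hsi = torch.abs(hsi_t1 - g_rgb2hsi(rgb_t2))   # difference in HSI space
change_map = 0.5 * (ChangeHead(rgb_bands)(diff_rgb) + ChangeHead(hsi_bands)(diff_hsi))

print(cycle_loss.item(), change_map.shape)  # -> scalar loss, torch.Size([1, 1, 64, 64])
```

The averaging of the two per-modality change maps stands in for the paper's spatial-spectral interactive co-training module, which instead iterates between the branches so that each modality guides the other.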