Change Masked Modality Alignment Network for Multimodal Change Detection

Impact factor 8.6 · Zone 1 (Earth Sciences) · JCR Q1 (Engineering, Electrical & Electronic)
Fenlong Jiang;Bo Huang;Husheng Wu;Dan Feng;Yu Zhou;Mingyang Zhang;Maoguo Gong;Wei Zhao;Ziyu Guan
DOI: 10.1109/TGRS.2024.3516001
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-16
Publication date: 2024-12-12
Publication type: Journal Article
Open access: no
Source: https://ieeexplore.ieee.org/document/10795250/
Citations: 0

Abstract

Using multimodal remote sensing images for change detection (CD) can significantly improve the feasibility and reliability in challenging environments. However, the differences in imaging mechanisms make multimodal images highly heterogeneous. A key challenge for multimodal CD (MCD) is that the heterogeneity of the modalities and changes in ground objects are intertwined during processing. To address this issue, this article proposes a change masked modality alignment network (CMMAN), which uses a multitask framework consisting of one CD branch and two image modal transformation (IMT) branches. Specifically, to ensure a unified feature space, bi-temporal multimodal images are first input into the same Swin-Transformer-based encoder. The extracted features are then fed simultaneously into the CD branch and separately into the two IMT branches. In the CD branch, the decoder is also designed based on the Swin-Transformer, and a weakly modality-correlated feature enhancement (WMCFE) module is introduced to mitigate the interference of modality heterogeneity on CD. For the two IMT branches, both employ a generative adversarial network (GAN) to transform between modalities, and the distributions of features from different modalities are aligned through simultaneous optimization. Uniquely, the change probability map predicted by the CD branch is utilized to mask the change regions in IMT, further decoupling ground object changes and modal heterogeneity. Experimental results on multiple public datasets demonstrate that the proposed CMMAN significantly improves MCD performance and shows good compatibility and portability with various common backbone networks.
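
The change-masking idea in the abstract can be illustrated with a small sketch. The abstract only states that the change probability map predicted by the CD branch masks the change regions in the IMT branches; it does not give the loss. The code below is therefore an assumption, not the authors' implementation: it weights a per-pixel L1 translation error by (1 - change probability), so that pixels likely to contain real ground-object changes contribute little to the modality-alignment objective. The function name `change_masked_l1`, the L1 error, and the array shapes are all hypothetical choices made for illustration.

```python
import numpy as np


def change_masked_l1(translated, target, change_prob, eps=1e-8):
    """Hypothetical change-masked translation loss (not from the paper).

    translated, target: arrays of shape (H, W, C), e.g. an image translated
        from modality A to modality B and the real modality-B image.
    change_prob: array of shape (H, W) with values in [0, 1], standing in
        for the change probability map produced by a CD branch.
    Returns the L1 error averaged over pixels, weighted by (1 - change_prob).
    """
    translated = np.asarray(translated, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    keep = 1.0 - np.asarray(change_prob, dtype=np.float64)  # 1 = unchanged
    per_pixel = np.abs(translated - target).mean(axis=-1)   # shape (H, W)
    # Normalise by the total weight so the loss does not shrink just
    # because a large area is masked out.
    return float((keep * per_pixel).sum() / (keep.sum() + eps))


# Toy 2x2 single-channel example: the top-left pixel has really changed,
# so its large translation error should not drive the alignment loss.
target = np.zeros((2, 2, 1))
translated = np.array([[[1.0], [0.2]], [[0.2], [0.2]]])
mask_from_cd = np.array([[1.0, 0.0], [0.0, 0.0]])

masked = change_masked_l1(translated, target, mask_from_cd)
unmasked = change_masked_l1(translated, target, np.zeros((2, 2)))
print(masked, unmasked)
```

In this toy case the masked loss reflects only the residual modality gap on unchanged pixels, while the unmasked loss is inflated by the changed pixel. That is the sense in which masking helps separate ground-object change from modal heterogeneity.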
Source journal
IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.