ACWCD: Utilizing Inherent Transformers Information and Prior Knowledge for Weakly Supervised Change Detection

Impact factor 8.6; Zone 1 (Earth Sciences); JCR Q1 (Engineering, Electrical & Electronic)

IEEE Transactions on Geoscience and Remote Sensing, Pub Date: 2025-01-08, DOI: 10.1109/TGRS.2025.3527009

Wenhao Liu; Zhuoyuan Yu; Bin Luo

Volume 63, pp. 1-14. IEEE Xplore: https://ieeexplore.ieee.org/document/10833791/

Citation count: 0

Abstract

Change detection (CD) using deep learning is crucial for analyzing changes on the Earth’s surface. Yet, obtaining accurate, extensive pixel-level labels is difficult and time-consuming. Consequently, there is growing interest in weakly supervised CD (WSCD) using image-level labels, praised for its high efficiency in label acquisition. Nonetheless, the lack of adequate supervision leads many existing WSCD methods to adopt intricate processes, neglecting the inherent information present in the networks. To overcome these challenges, we propose ACWCD, an end-to-end encoder-decoder framework based on transformers for WSCD using image-level labels. The proposed framework is primarily designed for vision transformer (ViT)-related backbones. It generates effective pseudo labels by tapping into the localizing prowess of class activation maps (CAMs) and simultaneously utilizes these labels for pixel-level supervision during training. Specifically, ACWCD comprises two pivotal components: the attention refinement (AR) module and the change priori (CP) constraint. By harnessing the inherent multihead self-attention (MHSA) of transformers, the AR module refines CAMs by producing change attention from MHSA, thereby refining the pseudo labels. Furthermore, utilizing prior knowledge, the CP constraint prevents the AR module from processing samples with unchanged image-level labels, thus addressing the issue of AR generating spurious change areas. In addition, an exclusive threshold is assigned to each pair of images to help differentiate pseudo labels. It also imposes penalties based on the proportion of mispredictions using the designed plug-and-play loss function. To validate the performance of ACWCD, experiments are conducted on three high-resolution remote sensing datasets. 
The outcomes reveal that the proposed framework not only achieves new state-of-the-art (SOTA) performance within the WSCD domain but also exhibits substantial scalability, as it does not involve any complex processes, serving as a useful baseline for future research. The code is available at https://github.com/WenhaoLiu03/ACWCD.
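The abstract describes the refinement idea only at a high level, so the following is a minimal sketch of the general pattern, not the authors' implementation (that is in the linked repository). It assumes a coarse change CAM over an `H x W` patch grid and a stack of self-attention maps of shape `(heads, H*W, H*W)`. The function names `refine_cam` and `pseudo_label`, the head-averaging and symmetrization steps, the number of propagation steps, and the `ratio` used for the per-pair threshold are all illustrative assumptions.

```python
import numpy as np


def refine_cam(cam, attn, image_label, n_iters=2):
    """Propagate a coarse CAM through attention-derived affinities.

    cam:         (H, W) non-negative activation map for the "changed" class.
    attn:        (heads, H*W, H*W) self-attention weights (assumed positive).
    image_label: 0 for an unchanged pair, 1 for a changed pair.
    """
    h, w = cam.shape
    if image_label == 0:
        # Prior-knowledge gate (assumed form): an unchanged pair
        # should not yield any change region, so skip refinement.
        return np.zeros_like(cam, dtype=float)

    # Average heads, symmetrize, and row-normalize into an affinity matrix.
    affinity = attn.mean(axis=0)
    affinity = (affinity + affinity.T) / 2.0
    affinity = affinity / affinity.sum(axis=1, keepdims=True)

    # Spread activation along the affinities for a few steps.
    flat = cam.reshape(-1).astype(float)
    for _ in range(n_iters):
        flat = affinity @ flat

    refined = flat.reshape(h, w)
    peak = refined.max()
    return refined / peak if peak > 0 else refined


def pseudo_label(refined, ratio=0.5):
    """Binarize with a threshold derived from this image pair's own peak."""
    threshold = ratio * refined.max()
    return (refined > threshold).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    attn = rng.random((2, 16, 16)) + 0.1   # stand-in for real MHSA weights
    cam = np.zeros((4, 4))
    cam[1, 2] = 1.0                        # one confident location
    print(pseudo_label(refine_cam(cam, attn, image_label=1)))
```

Deriving the threshold from each pair's own peak is one simple way to realize a per-pair threshold. The abstract does not say how ACWCD computes its threshold or its misprediction penalty, so neither is reproduced here.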


Source journal

IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry and Geophysics)

CiteScore: 11.50

Self-citation rate: 28.00%

Articles published: 1912

Review time: 4.0 months

Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.