MDFA-Net: Multi-Scale Differential Feature Self-Attention Network for Building Change Detection in Remote Sensing Images

IF 4.2 | Zone 2, Earth Sciences | JCR Q2, ENVIRONMENTAL SCIENCES
Remote Sensing Pub Date : 2024-09-18 DOI:10.3390/rs16183466
Yuanling Li, Shengyuan Zou, Tianzhong Zhao, Xiaohui Su
Citation count: 0

Abstract

Building change detection (BCD) from remote sensing images is an essential field for urban studies. In this well-developed field, Convolutional Neural Networks (CNNs) and Transformers have been leveraged to help BCD models handle multi-scale information. However, accurately detecting subtle changes with current models remains challenging, and this has been the main bottleneck to improving detection accuracy. In this paper, a multi-scale differential feature self-attention network (MDFA-Net) is proposed to effectively integrate CNN and Transformer by balancing the global receptive field of the self-attention mechanism against the local receptive field of convolutions. MDFA-Net introduces two modules. First, a hierarchical multi-scale dilated convolution (HMDConv) module extracts local features with hybrid dilated convolutions, which can ameliorate the effect of the CNN’s local bias. Second, a differential feature self-attention (DFA) module applies the self-attention mechanism to multi-scale difference feature maps, to overcome the problem that local details may be lost in the Transformer’s global receptive field. The proposed MDFA-Net achieves state-of-the-art accuracy in comparison with related works, e.g., USSFC-Net, on three open datasets: WHU-CD, CDD-CD, and LEVIR-CD. Based on the experimental results, MDFA-Net significantly exceeds other models in F1 score, IoU, and overall accuracy; the F1 score is 93.81%, 95.52%, and 91.21% on the WHU-CD, CDD-CD, and LEVIR-CD datasets, respectively. Furthermore, MDFA-Net ranked first or second in precision and recall in testing on all three datasets, which indicates a better balance between precision and recall than other models. We also found that subtle changes, i.e., small-sized building changes and irregular boundary changes, are better detected thanks to the introduction of HMDConv and DFA. In summary, with its better ability to leverage multi-scale differential information than traditional methods, MDFA-Net provides a novel and effective avenue to integrate CNN and Transformer in BCD. Further studies could focus on reducing the model’s sensitivity to hyper-parameters and improving its generalizability in practical applications.
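The abstract reports F1 score, IoU, overall accuracy, precision, and recall without spelling out how they are computed. The sketch below shows the standard pixel-wise definitions for a binary change mask, as an illustration only: the function name `change_metrics`, the `ChangeMetrics` container, and the toy labels are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass
class ChangeMetrics:
    precision: float
    recall: float
    f1: float
    iou: float
    overall_accuracy: float


def change_metrics(pred, target):
    """Compute pixel-wise metrics for binary change masks.

    pred and target are flat sequences of 0/1 labels (1 = changed).
    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1          # changed pixel correctly detected
        elif p and not t:
            fp += 1          # false alarm
        elif not p and t:
            fn += 1          # missed change
        else:
            tn += 1          # unchanged pixel correctly ignored

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    total = tp + fp + fn + tn
    overall_accuracy = (tp + tn) / total if total else 0.0
    return ChangeMetrics(precision, recall, f1, iou, overall_accuracy)


if __name__ == "__main__":
    # Toy masks, flattened to 1-D; real masks would be full images.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 1, 1, 0]
    print(change_metrics(predicted, reference))
```

Because F1 is the harmonic mean of precision and recall, a model can only score high on F1 when both are high, which is why the abstract treats ranking well on both as evidence of balance. For a single binary class, F1 and IoU are also tied by F1 = 2·IoU / (1 + IoU), so the two always rank models in the same order.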
Source journal
Remote Sensing
CiteScore: 8.30
Self-citation rate: 24.00%
Articles published: 5435
Review time: 20.66 days
Journal description: Remote Sensing (ISSN 2072-4292) publishes regular research papers, reviews, letters and communications covering all aspects of the remote sensing process, from instrument design and signal processing to the retrieval of geophysical parameters and their application in geosciences. Our aim is to encourage scientists to publish experimental, theoretical and computational results in as much detail as possible so that results can be easily reproduced. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.