Multi-branch reverse attention semantic segmentation network for building extraction
Wenxiang Jiang, Yan Chen, Xiaofeng Wang, Menglei Kang, Mengyuan Wang, Xuejun Zhang, Lixiang Xu, Cheng Zhang
DOI: 10.1016/j.ejrs.2023.12.003
Published: 2023-12-16
https://www.sciencedirect.com/science/article/pii/S1110982323001035
Abstract
Extracting color and texture features of buildings from high-resolution remote sensing images often suffers from interference by background information and from varying target scales. In addition, most current attention mechanisms focus on selecting key building features to optimize building extraction, but ignore the influence of the complex background. Hence, we propose incorporating a novel reverse attention module into the network. This module enables the model to selectively extract crucial building features while suppressing the impact of intricate background noise. It mitigates the influence of heterogeneous background targets with similar spectra and structures on building segmentation and extraction, thereby improving the overall generalizability of the model. The reverse attention also emphasizes and amplifies details pertaining to the boundaries of the target. Furthermore, we couple a new multi-branch convolution block into the encoder, integrating dilated convolutions with multiple dilation rates. Compared with methods that use a single multi-scale module to extract multi-scale information from high-level features only, we use convolutions with different receptive fields to capture multi-scale targets from multi-level features simultaneously, effectively improving the model's ability to extract multi-scale building features. The experimental findings demonstrate that our proposed multi-branch reverse attention semantic segmentation network achieves IoU of 90.59% and 81.79% on the well-known WHU building and Inria aerial image datasets, respectively.
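The abstract does not give implementation details, but the two components it names (a reverse attention module and a multi-branch block of dilated convolutions) can be illustrated with a minimal PyTorch sketch of how such modules are commonly built. The class names, channel counts, dilation rates, and the residual fusion step below are assumptions for illustration, not the authors' actual design.

```python
# Minimal sketch of the two components described in the abstract.
# All names, channel counts, and dilation rates are illustrative assumptions,
# not the paper's actual implementation.
import torch
import torch.nn as nn


class ReverseAttention(nn.Module):
    """Down-weights regions a coarse prediction already marks as building,
    so the branch focuses on suppressed background and missed boundary detail."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)

    def forward(self, features: torch.Tensor, coarse_logits: torch.Tensor) -> torch.Tensor:
        # Reverse map: high where the coarse prediction says "background".
        reverse_map = 1.0 - torch.sigmoid(coarse_logits)   # (N, 1, H, W), broadcasts over channels
        attended = features * reverse_map                  # re-weight encoder features
        return self.conv(attended) + features              # residual refinement


class MultiBranchDilatedConv(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates capture
    buildings at multiple scales from the same feature map."""

    def __init__(self, in_channels: int, out_channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Fuse the concatenated branch outputs back to out_channels.
        self.fuse = nn.Conv2d(out_channels * len(dilations), out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 64, 128, 128)    # encoder feature map
    coarse = torch.randn(1, 1, 128, 128)    # coarse building logits
    print(ReverseAttention(64)(feats, coarse).shape)    # torch.Size([1, 64, 128, 128])
    print(MultiBranchDilatedConv(64, 64)(feats).shape)  # torch.Size([1, 64, 128, 128])
```

In this sketch the reverse map `1 - sigmoid(coarse_logits)` plays the role the abstract attributes to reverse attention (suppressing background interference and highlighting boundary regions), while the parallel dilated branches stand in for the multi-branch encoder block that gathers multi-scale context; how the paper actually wires these into its multi-level encoder-decoder is not specified here.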