CRU-Net: An Innovative Network for Building Extraction From Remote Sensing Images Based on Channel Enhancement and Multiscale Spatial Attention With ResNet
Impact Factor: 1.5 | CAS Tier 4 (Computer Science) | JCR Q3 (Computer Science, Software Engineering)
Authors: Zhuozhao Chen, Wenbo Chen, Jiao Zheng, Yuanyuan Ding
DOI: 10.1002/cpe.70249
Journal: Concurrency and Computation: Practice & Experience, Vol. 37, Issue 21-22
Publication date: 2025-08-14 (Journal Article; not open access)
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70249
Citations: 0
Abstract
Building extraction from high-resolution remote sensing images enables systematic quantification of urban form evolution, supports critical decision-making in infrastructure development and land-use optimization, and facilitates disaster resilience and risk assessment. However, characteristics of urban landscapes such as high-density building distribution and heterogeneous building geometries pose significant challenges to achieving pixel-level accuracy. To address these challenges, we propose an innovative network for building extraction based on channel enhancement and multiscale spatial attention with ResNet (CRU-Net). First, CRU-Net employs U-Net as the core architecture, with ResNet34 as the encoder. Second, to fully exploit the ability of convolutional neural networks to extract features at multiple scales, a new dilated residual block (DRB) is designed by combining a residual block with dilated convolution. Replacing the residual blocks in ResNet34 with DRBs enhances the ability of CRU-Net to extract semantic information at different scales for building extraction. Next, the channel enhancement and multiscale spatial attention (CEMS) module is proposed and added to the skip connections of the network. CEMS learns the most important features both spatially and channel-wise, enhancing the network's feature representation ability. Finally, a joint loss function combining normalized cross-correlation loss and binary cross-entropy loss is introduced to train the network, enabling it to learn both global and local features of buildings. Experiments show that CRU-Net achieves high accuracy and intersection over union (IoU) values on the Massachusetts building dataset, the Inria aerial image labeling dataset, and the WHU building dataset.
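The abstract names a joint loss combining normalized cross-correlation (NCC) loss and binary cross-entropy (BCE) loss but does not give its exact form. A minimal NumPy sketch of one plausible formulation is shown below; the weighting `alpha` and the specific NCC definition (1 minus the zero-mean normalized correlation between prediction and ground truth) are assumptions for illustration, not the paper's published formulation:

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # Binary cross-entropy averaged over all pixels; clipping avoids log(0).
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def ncc_loss(pred, target, eps=1e-7):
    # 1 - normalized cross-correlation: small when the predicted mask is
    # globally correlated with the ground-truth mask, so it acts as a
    # global-structure term alongside the per-pixel BCE term.
    p = pred - pred.mean()
    t = target - target.mean()
    ncc = (p * t).sum() / (np.sqrt((p ** 2).sum() * (t ** 2).sum()) + eps)
    return 1.0 - ncc

def joint_loss(pred, target, alpha=0.5):
    # alpha is a hypothetical weighting; the paper's value is not given here.
    return alpha * bce_loss(pred, target) + (1 - alpha) * ncc_loss(pred, target)
```

In this sketch the BCE term drives per-pixel (local) accuracy while the NCC term rewards agreement in overall mask structure, which matches the abstract's stated goal of learning both local and global building features.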
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.