NeSF-Net: Building roof and facade segmentation based on neighborhood relationship awareness and scale-frequency modulation network for high-resolution remote sensing images

IF 10.6 | Zone 1, Earth Sciences | Q1 GEOGRAPHY, PHYSICAL

ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-05-26 | DOI: 10.1016/j.isprsjprs.2025.05.025

Yuan Zhou, Wanshou Jiang, Bin Wang

Volume 226, Pages 247-266. Full text: https://www.sciencedirect.com/science/article/pii/S0924271625002126

Citations: 0

Abstract

Building information extraction holds significant application value in smart city development, urban planning, and management. With the accelerating process of urbanization, mid- and high-rise buildings are increasingly prevalent. In orthophotos, the roofs of tall buildings often do not fully overlap with their footprints. In satellite images from oblique angles, buildings may also be obstructed or affected by shadows. Therefore, building information extraction should evolve from a roof-only extraction task to a comprehensive task that includes both roofs and facades. Current methods predominantly employ convolutional neural networks (CNNs) and Transformer models, focusing on describing building boundary and global features. However, these methods have two limitations: insufficient utilization of information between pixels and limited spatial information recovery capabilities in decoders. This makes it difficult to distinguish between roofs and facades and to maintain the morphological structure of buildings. To address these issues, this paper proposes a new network architecture, NeSF-Net, designed to focus on the accurate extraction of roofs and facades. NeSF-Net consists of two core modules: the neighborhood relationship awareness module (NRAM) and the scale-frequency modulation decoder (SFMD). NRAM enhances the connectivity between pixels by constructing sub-neighborhood relationship awareness in the latent space of deep features, effectively improving the integrity of the segmentation results. SFMD significantly reduces the loss of spatial information during the upsampling process by thoroughly extracting and integrating the scale and frequency features of buildings in the decoder. Experiments were conducted on the BANDON dataset, which contains images captured from oblique angles. The proposed method achieved an mIoU of 72.71% and an F1 score of 83.04%, outperforming state-of-the-art segmentation methods. The performance in facade extraction was particularly notable, with an mIoU score exceeding the second-best method by 4.92%. Additionally, generalization experiments were conducted using GaoFen-7 satellite images, taking Shenzhen as a case study. The results demonstrate that the proposed method exhibits good generalization and robustness.
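For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU and a class-averaged F1 score are commonly computed from a confusion matrix. It is not the authors' evaluation code: the three-class labeling (0 = background, 1 = roof, 2 = facade), the choice to average over all classes, and the function names `confusion_matrix` and `per_class_iou_f1` are assumptions made for illustration, and the abstract does not state which classes enter the averages.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the reference class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou_f1(cm: List[List[int]]) -> Tuple[List[float], List[float]]:
    """Return per-class IoU = TP / (TP + FP + FN) and F1 = 2TP / (2TP + FP + FN)."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # reference c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp     # predicted c, reference otherwise
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        ious.append(tp / iou_den if iou_den else 0.0)
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
    return ious, f1s


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


# Hypothetical flattened label maps (assumed: 0 = background, 1 = roof, 2 = facade).
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
ious, f1s = per_class_iou_f1(cm)
miou = mean(ious)   # mean IoU over all classes
mf1 = mean(f1s)     # class-averaged F1
print(f"mIoU = {miou:.4f}, mean F1 = {mf1:.4f}")
```

Because F1 = 2·IoU / (1 + IoU) for any single class, F1 is never lower than IoU, which is consistent with the reported F1 score being higher than the reported mIoU.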

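Source journal

ISPRS Journal of Photogrammetry and Remote Sensing (Engineering & Technology: Imaging Science & Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days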

About the journal: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.