{"title":"CRFFNet: A cross-view reprojection based feature fusion network for fine-grained building segmentation using satellite-view and street-view data","authors":"Jinhua Yu , Junyan Ye , Yi Lin, Weijia Li","doi":"10.1016/j.inffus.2025.103795","DOIUrl":null,"url":null,"abstract":"<div><div>Fine-grained building attribute segmentation is crucial for rapidly acquiring urban geographic information and understanding urban development dynamics. To achieve a comprehensive perception of buildings, fusing cross-view data, which combines the wide coverage of satellite-view imagery with the detailed observations of street-view images, has become increasingly important. However, existing methods still struggle to effectively mitigate feature discrepancies across different views during cross-view fusion. To address this challenge, we propose the CRFFNet, a Cross-view Reprojection-based Feature Fusion Network for fine-grained building attribute segmentation. CRFFNet eliminates the perspective differences between satellite-view (satellite image and map data) and street-view features, enabling high-precision building attribute segmentation. Specifically, we introduce a deformable module to reduce target distortions in panoramic street-view images, and develop an Explicit Geometric Reprojection (EGR) module, which leverages explicit BEV geometric priors to reproject street-view features onto the satellite-view plane without requiring complex parameter inputs or depth information. To support evaluation, we construct two new datasets, Washington and Seattle, which include satellite imagery, map data, and panoramic street-view images, serving as benchmarks for cross-view, fine-grained building attribute segmentation. Extensive experiments conducted on these datasets, as well as on the public OmniCity and Brooklyn datasets, demonstrate that CRFFNet achieves mIoU improvements of 1.02% on Washington, 8.12% on Seattle, 2.29% on OmniCity, and 2.87% on Brooklyn compared to the second-best method. These improvements demonstrate the potential of our CRFFNet for applications involving large-scale multi-source data, contributing to more comprehensive urban analysis and planning.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"127 ","pages":"Article 103795"},"PeriodicalIF":15.5000,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525008577","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Fine-grained building attribute segmentation is crucial for rapidly acquiring urban geographic information and understanding urban development dynamics. To achieve a comprehensive perception of buildings, fusing cross-view data, which combines the wide coverage of satellite-view imagery with the detailed observations of street-view images, has become increasingly important. However, existing methods still struggle to effectively mitigate feature discrepancies across different views during cross-view fusion. To address this challenge, we propose CRFFNet, a Cross-view Reprojection-based Feature Fusion Network for fine-grained building attribute segmentation. CRFFNet eliminates the perspective differences between satellite-view features (from satellite imagery and map data) and street-view features, enabling high-precision building attribute segmentation. Specifically, we introduce a deformable module to reduce target distortions in panoramic street-view images, and develop an Explicit Geometric Reprojection (EGR) module, which leverages explicit BEV geometric priors to reproject street-view features onto the satellite-view plane without requiring complex parameter inputs or depth information. To support evaluation, we construct two new datasets, Washington and Seattle, which include satellite imagery, map data, and panoramic street-view images, serving as benchmarks for cross-view, fine-grained building attribute segmentation. Extensive experiments conducted on these datasets, as well as on the public OmniCity and Brooklyn datasets, demonstrate that CRFFNet achieves mIoU improvements of 1.02% on Washington, 8.12% on Seattle, 2.29% on OmniCity, and 2.87% on Brooklyn compared to the second-best method. These improvements demonstrate the potential of CRFFNet for applications involving large-scale multi-source data, contributing to more comprehensive urban analysis and planning.
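To make the reprojection idea concrete, the sketch below illustrates ground-plane inverse perspective mapping, one common way to reproject panoramic street-view features onto a BEV grid using only explicit geometry (camera height and azimuth), with no depth input. This is a minimal illustrative example, not the paper's EGR module; the function name panorama_to_bev and all parameters (cam_height, bev_size, cell_m) are hypothetical assumptions for this sketch.

import numpy as np

def panorama_to_bev(feat, cam_height=2.5, bev_size=64, cell_m=0.5):
    # feat: (C, H, W) equirectangular street-view feature map; W spans 360 deg azimuth.
    # Returns a (C, bev_size, bev_size) top-down (satellite-view plane) grid centred on the camera.
    C, H, W = feat.shape
    bev = np.zeros((C, bev_size, bev_size), dtype=feat.dtype)
    half = bev_size // 2

    # Ground-plane coordinates (metres) of every BEV cell, camera at the grid centre.
    ys, xs = np.meshgrid(np.arange(bev_size), np.arange(bev_size), indexing="ij")
    dx = (xs - half) * cell_m          # east offset
    dy = (half - ys) * cell_m          # north offset
    dist = np.sqrt(dx ** 2 + dy ** 2) + 1e-6

    # Explicit geometric prior: azimuth picks the panorama column, and the
    # depression angle implied by the assumed camera height picks the row.
    azimuth = np.arctan2(dx, dy)                              # 0 rad = north
    col = ((azimuth + np.pi) / (2 * np.pi) * (W - 1)).astype(int)
    depression = np.arctan2(cam_height, dist)                 # angle below horizon
    row = np.clip((H / 2 + depression / (np.pi / 2) * (H / 2 - 1)).astype(int), 0, H - 1)

    bev[:, ys, xs] = feat[:, row, col]                        # nearest-neighbour sampling
    return bev

# Usage: reproject a random 128-channel panorama feature map onto a 64x64 BEV grid.
bev_feat = panorama_to_bev(np.random.rand(128, 64, 256).astype(np.float32))
print(bev_feat.shape)  # (128, 64, 64)

In practice such a ground-plane prior is only one possible instantiation; the key design point it illustrates is that the street-to-satellite mapping can be computed analytically, without learned depth or per-image calibration parameters.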
Journal Introduction
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.