Title: Geo-scenes dissecting urban fabric: Understanding and recognition combining AI, remotely sensed data and multimodal spatial semantics
Authors: Hanqing Bao, Lanyue Zhou, Lukas W. Lehnert
DOI: 10.1016/j.isprsjprs.2025.10.011
Journal: ISPRS Journal of Photogrammetry and Remote Sensing, Volume 230, Pages 716-737
Published: 2025-10-16 (Journal Article)
Impact factor: 12.2; JCR: Q1 (Geography, Physical)
URL: https://www.sciencedirect.com/science/article/pii/S0924271625003995
Citations: 0
Abstract
Urban fabric represents the intersection of spatial structure and social function. Analyzing its geographic components, functional semantics, and interactive relationships enables a deeper understanding of the formation and evolution of urban geo-scenes. Urban geo-scenes (UGS), as the fundamental units of urban systems, play a vital role in balancing and optimizing spatial layout, while enhancing urban resilience and vitality. Although multimodal spatial data are widely used to describe UGS, conventional approaches that rely solely on visual or social features are insufficient when addressing the complexity of modern urban systems. The spatial relationships and distributional patterns among urban elements are equally crucial for capturing the full semantic structure of urban geo-scenes. In parallel, most deep learning models still face limitations in effectively mining and fusing such diverse information. To address these challenges, we propose a multimodal deep learning framework for UGS recognition. Guided by the concepts of urban fabric and spatial co-location patterns, our method dissects the internal structure of geo-scenes and constructs a bottom-up urban fabric graph model to capture spatial semantics among geographic entities. Specifically, we employ a customized SE-DenseNet branch to extract deep physical and visual features from high-resolution satellite imagery, along with social semantic information from auxiliary data (e.g., POIs, building footprint coverage). A semantic fusion module is further introduced to enable collaborative interaction among multi-modal and multi-scale features. The framework was validated across four Chinese cities with varying sizes, economic levels, and cultural contexts. The proposed method achieved an overall accuracy of approximately 90%, outperforming existing state-of-the-art multimodal approaches. 
Moreover, ablation studies conducted in three cities of different scales confirm the critical role of urban fabric in UGS recognition. Our results demonstrate that the joint modeling of visual appearance, functional attributes, and spatial semantics offers a novel and more comprehensive understanding of urban geo-scenes.
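The abstract's "customized SE-DenseNet branch" builds on squeeze-and-excitation (SE) channel attention, which recalibrates feature-map channels before fusion. As a rough illustration of that mechanism only (the authors' actual architecture is not published here), the sketch below implements a single SE step in plain numpy; all names, shapes, and the reduction ratio are illustrative assumptions, not the paper's code.

```python
import numpy as np

def se_block(features: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Recalibrate the channels of a (C, H, W) feature map via SE attention.

    features : (C, H, W) activations, e.g. from a dense block
    w1       : (C//r, C) squeeze weights (r = reduction ratio, assumed here)
    w2       : (C, C//r) excitation weights
    """
    # Squeeze: global average pooling per channel -> vector of length C
    z = features.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gating in (0, 1)
    s = np.maximum(w1 @ z, 0.0)               # (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))    # (C,)
    # Scale: reweight each channel map by its learned gate
    return features * gate[:, None, None]

# Toy usage: 8 channels, 4x4 spatial maps, reduction ratio r = 2
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((4, 8)) * 0.1
w2 = rng.standard_normal((8, 4)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because the sigmoid gate lies strictly in (0, 1), each channel is attenuated rather than amplified; in a full network the gate weights are learned jointly with the convolutional features, letting informative channels dominate the downstream fusion.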
Journal Introduction:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) is the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It serves as a platform for scientists and professionals worldwide working in disciplines that use photogrammetry, remote sensing, spatial information systems, computer vision, and related fields, facilitating the communication and dissemination of advances in these disciplines while acting as a comprehensive source of reference and archive.
P&RS publishes high-quality, peer-reviewed research papers that should be original and not previously published. Papers may cover scientific/research, technological-development, or application/practical aspects. The journal also welcomes papers based on presentations at ISPRS meetings, provided they constitute significant contributions to the fields above.
In particular, P&RS encourages submissions that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, address topics that have received limited attention in P&RS or related journals, or explore new scientific or professional directions. Theoretical papers should preferably include practical applications, while papers on systems and applications should include a theoretical background.