Yi Zhong, Jie Jiang, Weize Quan, Mingyang Zhao, Dong-ming Yan
Building and Environment, Volume 279, Article 112955. Published 2025-04-12. Impact factor: 7.1 (JCR Q1, Construction & Building Technology).
DOI: 10.1016/j.buildenv.2025.112955
https://www.sciencedirect.com/science/article/pii/S0360132325004378
Distinctive learning of latent space feature for occlusion-aware facade parsing
Significant strides in deep learning have advanced facade parsing, a computer vision task that classifies architectural elements into semantic blocks. A key challenge is handling occlusions in facade images. Current methods struggle because they make suboptimal use of multi-scale modules and insufficiently differentiate feature categories in latent space. To address this, we propose a novel multi-scale deep learning architecture, enhanced by a distinction loss function, to better capture multi-scale characteristics. The architecture includes a three-stream latent space feature enhancement structure: a main stream for primary processing and two auxiliary streams for feature refinement. Within the architecture, our dual-branch Context Aggregation Module reconciles discrepancies between global and local features. The distinction loss is tailored to the network architecture and guides the two auxiliary streams to concentrate on specific feature types, thereby enhancing discrimination and reducing confusion. The proposed framework for distinct learning in the latent feature space introduces a novel learning paradigm for neural network training, in which simple yet effective modifications to the loss function improve performance. In addition, the ability of our method to parse facades under occlusion could be significantly impactful in engineering applications such as urban planning, architectural design, and autonomous driving. Our experiments on benchmark facade datasets demonstrate the superior performance of our approach in handling occlusions and effectively parsing facades, indicating its potential to advance the application of facade parsing in increasingly complex scenarios.
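The abstract does not give the form of the distinction loss, so the following is only an illustrative sketch of the general idea of one main stream plus two auxiliary streams that each concentrate on a subset of classes. Everything here is an assumption rather than the paper's formulation: the function names (`masked_cross_entropy`, `three_stream_loss`), the class groups `group_a` and `group_b`, the weight `aux_weight`, and the choice of masked cross-entropy as the per-stream objective are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(logits_per_pixel, labels, keep_classes=None):
    """Mean cross-entropy over pixels whose label is in keep_classes.

    If keep_classes is None, every pixel contributes. Returns 0.0 when
    no pixel matches, so an empty group does not break the sum.
    """
    losses = []
    for logits, label in zip(logits_per_pixel, labels):
        if keep_classes is not None and label not in keep_classes:
            continue
        losses.append(-math.log(softmax(logits)[label]))
    return sum(losses) / len(losses) if losses else 0.0


def three_stream_loss(main_logits, aux_a_logits, aux_b_logits, labels,
                      group_a, group_b, aux_weight=0.5):
    """Hypothetical combined loss (assumption, not the paper's definition).

    The main stream is supervised on all pixels; each auxiliary stream is
    supervised only on pixels from its assigned class group, which pushes
    the two auxiliary streams toward different feature types.
    """
    main = masked_cross_entropy(main_logits, labels)
    aux_a = masked_cross_entropy(aux_a_logits, labels, group_a)
    aux_b = masked_cross_entropy(aux_b_logits, labels, group_b)
    return main + aux_weight * (aux_a + aux_b)


if __name__ == "__main__":
    # Four pixels, three classes, uninformative (uniform) predictions.
    labels = [0, 1, 2, 1]
    uniform = [[0.0, 0.0, 0.0] for _ in labels]
    loss = three_stream_loss(uniform, uniform, uniform, labels,
                             group_a={0}, group_b={1, 2})
    print(loss)
```

In this sketch the masking is what separates the roles of the auxiliary streams: each one receives gradient only from its own class group, while the main stream still sees every pixel. The actual network, the Context Aggregation Module, and the real loss are described in the paper itself.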
About the journal:
Building and Environment, an international journal, is dedicated to publishing original research papers, comprehensive review articles, editorials, and short communications in the fields of building science, urban physics, and human interaction with the indoor and outdoor built environment. The journal emphasizes innovative technologies and knowledge verified through measurement and analysis. It covers environmental performance across various spatial scales, from cities and communities to buildings and systems, fostering collaborative, multi-disciplinary research with broader significance.