From pixels to 3D models: Mask2Former-driven automated reconstruction of Jiangnan traditional villages using remote sensing images
Mengli Zhou, Peidi Yin, Jinxiaoyu Cui, Huading Lou, Zeyu Yang, Jiang Liu, Changhai Peng
Journal of Building Engineering, Volume 114, Article 114277. Published 2025-10-03. Impact factor: 7.4 (JCR Q1, Construction & Building Technology).
DOI: 10.1016/j.jobe.2025.114277
URL: https://www.sciencedirect.com/science/article/pii/S2352710225025148
Citations: 0
Abstract
3D modeling is of great significance to building engineering, visualization, and design, but it still faces challenges of scalability, cost, and labor-intensive manual processes, which make it difficult to apply on a large scale in building heritage preservation and rural revitalization. To address these issues, this study proposes a novel deep learning-based approach that, for the first time, combines instance segmentation (Mask2Former and Mask R-CNN) with shadow-derived height estimation from remote sensing images to achieve automated 3D reconstruction of Jiangnan traditional villages. Both models were trained on datasets of buildings and shadows and then used to predict building and shadow masks automatically. Morphological post-processing was then applied to regularize the extracted binary mask contours of the traditional villages, and building heights were estimated from the calculated shadow lengths. Finally, the approach was validated by comparing the estimated heights with heights measured by unmanned aerial vehicle tilt photography across two villages. The results demonstrate that Mask2Former performs better, with an accuracy of 88.95% and a precision of 89.46%. Across all buildings, the mean absolute error, root mean square error, and mean absolute percentage error are 0.53 m, 0.92 m, and 9.59%, respectively, confirming the reliability of the proposed approach in estimating building heights. This study provides an automated, efficient, and low-cost technique for 3D modeling of rural buildings, addressing the critical need for scalable heritage digitization of traditional villages.
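The abstract does not give the height formula or the evaluation code, so the following is only a minimal sketch of the two numerical steps it mentions. It assumes the common flat-terrain, near-nadir relation H = L · tan(sun elevation) for shadow-derived height, which may differ from the authors' actual geometry, and it uses the standard definitions of MAE, RMSE, and MAPE. The function names `height_from_shadow` and `error_metrics` and all sample values are hypothetical and are not taken from the paper.

```python
import math


def height_from_shadow(shadow_length_m: float, sun_elevation_deg: float) -> float:
    """Estimate building height from shadow length.

    Assumption (not from the paper): flat ground and a near-nadir view,
    so that height = shadow length * tan(sun elevation angle).
    """
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


def error_metrics(estimated, measured):
    """Return (MAE, RMSE, MAPE in percent) for paired height lists."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [e - m for e, m in zip(estimated, measured)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # MAPE is relative to the measured (reference) height.
    mape = 100.0 * sum(abs(d) / m for d, m in zip(diffs, measured)) / n
    return mae, rmse, mape


if __name__ == "__main__":
    # Hypothetical shadow lengths (m) and a hypothetical sun elevation (deg).
    shadows = [6.0, 9.5, 4.2]
    estimated = [height_from_shadow(s, 40.0) for s in shadows]
    # Hypothetical reference heights (m), standing in for UAV measurements.
    measured = [5.3, 7.6, 3.9]
    mae, rmse, mape = error_metrics(estimated, measured)
    print(f"MAE={mae:.2f} m  RMSE={rmse:.2f} m  MAPE={mape:.2f} %")
```

In a real pipeline the shadow length would come from the predicted shadow masks, measured along the sun azimuth and converted from pixels to metres using the image's ground sampling distance. Both steps are omitted here.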
About the journal:
The Journal of Building Engineering is an interdisciplinary journal that covers all aspects of science and technology concerned with the whole life cycle of the built environment; from the design phase through to construction, operation, performance, maintenance and its deterioration.