Few-shot learning with large foundation models for automated segmentation and accessibility analysis in architectural floor plans

Journal of Infrastructure Intelligence and Resilience. Pub Date: 2024-12-17. DOI: 10.1016/j.iintel.2024.100137

Haolan Zhang, Ruichuan Zhang

Volume 4 (2), Article 100137. Source: https://www.sciencedirect.com/science/article/pii/S2772991524000562

Abstract

This paper presents a novel approach for extracting accessibility features from 2D raster floor plans by integrating few-shot learning techniques with the Segment Anything Model (SAM) and GPT-4. The proposed method addresses two limitations of existing deep learning-based floor plan analysis: it often requires extensive annotated datasets, and it struggles with the variability of raster floor plans. Furthermore, little research has addressed the extraction of accessibility features from 2D raster floor plans, which remain one of the most common formats for storing architectural plans after design and construction. Our approach, GPT-integrated Multi-object Few-shot SAM (GMFS), uses similarity maps and cluster-based point sampling to generate accurate visual prompts for SAM, enabling the segmentation of rooms and doors from only five reference samples. The segmented masks are then classified using GPT-4, which adds semantic labels to the floor plan analysis. We validated GMFS on the CubiCasa and Rent3D datasets, where it showed strong segmentation and classification performance. A detailed case study further demonstrated the practical application of our approach in calculating accessible means of egress and wheelchair clear space, both of which are critical features for accessibility compliance. The results highlight the effectiveness and adaptability of our approach in real-world scenarios and its potential to improve building accessibility and safety analysis in the architecture, engineering, and construction (AEC) industry.
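The abstract does not spell out how similarity maps and cluster-based point sampling produce prompts, so the following is only a minimal sketch of the general idea, not the authors' GMFS implementation. It assumes per-pixel features from some image encoder are already available as an array, and that a single reference feature vector has been derived from the few reference samples. The function names `similarity_map` and `sample_point_prompts`, the cosine-similarity measure, the threshold, and the small k-means loop are all illustrative assumptions.

```python
import numpy as np


def similarity_map(features, reference):
    """Cosine similarity between every pixel feature and one reference vector.

    features:  (H, W, C) array of per-pixel features (assumed precomputed).
    reference: (C,) feature vector summarising the reference samples.
    Returns an (H, W) array with values in [-1, 1].
    """
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    r = reference / (np.linalg.norm(reference) + 1e-8)
    return f @ r


def sample_point_prompts(sim, threshold=0.8, k=2, iters=20, seed=0):
    """Cluster high-similarity pixels and return one (x, y) point per cluster.

    A plain k-means over pixel coordinates stands in for whatever
    clustering the paper actually uses (hypothetical choice).
    """
    ys, xs = np.nonzero(sim >= threshold)
    pts = np.stack([xs, ys], axis=1).astype(float)
    if len(pts) == 0:
        return np.empty((0, 2), dtype=int)

    k = min(k, len(pts))
    rng = np.random.default_rng(seed)
    centers = pts[rng.choice(len(pts), size=k, replace=False)].copy()

    for _ in range(iters):
        # Assign each candidate pixel to its nearest cluster center.
        dist = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = pts[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)

    # Snap each center to the closest candidate pixel, so every prompt
    # lies on a pixel that actually passed the similarity threshold.
    dist = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=-1)
    return pts[dist.argmin(axis=0)].astype(int)


# Toy example: two square regions whose features match the reference.
features = np.zeros((20, 20, 3))
features[..., 1] = 1.0                    # background feature
features[2:6, 2:6] = [1.0, 0.0, 0.0]      # first matching region
features[14:18, 14:18] = [1.0, 0.0, 0.0]  # second matching region
reference = np.array([1.0, 0.0, 0.0])

sim = similarity_map(features, reference)
prompts = sample_point_prompts(sim, threshold=0.8, k=2)
```

In this sketch each returned row is an (x, y) pixel coordinate that could be passed to a promptable segmentation model as a positive point prompt. How GMFS chooses the number of clusters, handles negative points, or combines multiple object classes is not stated in the abstract and is left out here.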
