Question-answering framework for building codes using fine-tuned and distilled pre-trained transformer models
Authors: Xiaorui Xue, Jiansong Zhang, Yunfeng Chen
Journal: Automation in Construction (Impact Factor 9.6; JCR Q1, Construction & Building Technology)
Published: 2024-09-12
DOI: 10.1016/j.autcon.2024.105730
URL: https://www.sciencedirect.com/science/article/pii/S0926580524004667
Citations: 0
Abstract
Building code compliance checking is considered a bottleneck in construction projects, which calls for a novel approach to building code query and information retrieval. To address this research gap, the paper presents a question-answering framework comprising: (1) a ‘retriever’ for efficient context retrieval from building codes in response to an inquiry, and (2) a ‘reader’ for precise context interpretation and answer generation. The ‘retriever’, based on the BM25 algorithm, achieved a top-1 precision, recall, and F1-score of 0.95, 0.95, and 0.95, and a top-5 precision, recall, and F1-score of 0.97, 1.00, and 0.99, respectively. The ‘reader’, utilizing the transformer-based “xlm-roberta-base-squad2-distilled” model, achieved a top-4 accuracy of 0.95 and a top-1 F1-score of 0.84. A fine-tuning and model distillation process was used and shown to provide high performance on a limited amount of training data, overcoming a common barrier in the development of domain-specific (e.g., construction) deep learning models.
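To make the ‘retriever’ stage concrete, the following is a minimal sketch of BM25 ranking over a small set of passages. It is not the authors' implementation: the abstract does not describe their tokenization, parameters, or corpus, so the class name `BM25Retriever`, the regex tokenizer, the defaults `k1=1.5` and `b=0.75` (commonly used values), the non-negative IDF variant, and the three sample passages (illustrative, not quoted from any actual building code) are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Retriever:
    """Hypothetical BM25 ranker; k1 and b are common defaults, not the paper's values."""

    def __init__(self, passages, k1=1.5, b=0.75):
        self.passages = passages
        self.k1 = k1
        self.b = b
        self.doc_tokens = [tokenize(p) for p in passages]
        self.doc_freqs = [Counter(toks) for toks in self.doc_tokens]
        self.doc_lens = [len(toks) for toks in self.doc_tokens]
        self.avg_len = sum(self.doc_lens) / len(self.doc_lens)

        # Document frequency: number of passages containing each term.
        n = len(passages)
        df = Counter()
        for toks in self.doc_tokens:
            df.update(set(toks))
        # IDF variant with +1 inside the log, so weights are never negative.
        self.idf = {t: math.log(1 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()}

    def score(self, query, idx):
        """BM25 score of passage `idx` for the given query."""
        freqs = self.doc_freqs[idx]
        # Length normalization: longer-than-average passages are penalized.
        norm = self.k1 * (1 - self.b + self.b * self.doc_lens[idx] / self.avg_len)
        total = 0.0
        for term in tokenize(query):
            tf = freqs.get(term, 0)
            if tf == 0:
                continue  # term absent from this passage contributes nothing
            total += self.idf[term] * tf * (self.k1 + 1) / (tf + norm)
        return total

    def top_k(self, query, k=5):
        """Return up to k (passage, score) pairs, best first."""
        scored = [(self.score(query, i), i) for i in range(len(self.passages))]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(self.passages[i], s) for s, i in scored[:k]]


# Illustrative passages only; not taken from any real building code.
passages = [
    "Exit doors shall have a minimum clear width of 32 inches.",
    "Stairway handrails shall be installed between 34 and 38 inches above the tread nosing.",
    "Guards shall be not less than 42 inches high.",
]
retriever = BM25Retriever(passages)
for passage, score in retriever.top_k("What is the minimum clear width of exit doors?", k=2):
    print(f"{score:.3f}  {passage}")
```

In a retriever-reader framework of the kind the abstract describes, the top-ranked passages would then be passed as context to the reader model, which extracts the answer span.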
About the journal:
Automation in Construction is an international journal that focuses on publishing original research papers related to the use of Information Technologies in various aspects of the construction industry. The journal covers topics such as design, engineering, construction technologies, and the maintenance and management of constructed facilities.
The scope of Automation in Construction is extensive and covers all stages of the construction life cycle. This includes initial planning and design, construction of the facility, operation and maintenance, as well as the eventual dismantling and recycling of buildings and engineering structures.