Jungwon Lee , Seungjun Ahn , Daeho Kim , Dongkyun Kim
{"title":"Performance comparison of retrieval-augmented generation and fine-tuned large language models for construction safety management knowledge retrieval","authors":"Jungwon Lee , Seungjun Ahn , Daeho Kim , Dongkyun Kim","doi":"10.1016/j.autcon.2024.105846","DOIUrl":null,"url":null,"abstract":"<div><div>Construction safety standards are in unstructured formats like text and images, complicating their effective use in daily tasks. This paper compares the performance of Retrieval-Augmented Generation (RAG) and fine-tuned Large Language Model (LLM) for the construction safety knowledge retrieval. The RAG model was created by integrating GPT-4 with a knowledge graph derived from construction safety guidelines, while the fine-tuned LLM was fine-tuned using a question-answering dataset derived from the same guidelines. These models' performance is tested through case studies, using accident synopses as a query to generate preventive measurements. The responses were assessed using metrics, including cosine similarity, Euclidean distance, BLEU, and ROUGE scores. It was found that both models outperformed GPT-4, with the RAG model improving by 21.5 % and the fine-tuned LLM by 26 %. The findings highlight the relative strengths and weaknesses of the RAG and fine-tuned LLM approaches in terms of applicability and reliability for safety management.</div></div>","PeriodicalId":8660,"journal":{"name":"Automation in Construction","volume":"168 ","pages":"Article 105846"},"PeriodicalIF":9.6000,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automation in Construction","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092658052400582X","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CONSTRUCTION & BUILDING TECHNOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Construction safety standards are in unstructured formats like text and images, complicating their effective use in daily tasks. This paper compares the performance of Retrieval-Augmented Generation (RAG) and fine-tuned Large Language Model (LLM) for the construction safety knowledge retrieval. The RAG model was created by integrating GPT-4 with a knowledge graph derived from construction safety guidelines, while the fine-tuned LLM was fine-tuned using a question-answering dataset derived from the same guidelines. These models' performance is tested through case studies, using accident synopses as a query to generate preventive measurements. The responses were assessed using metrics, including cosine similarity, Euclidean distance, BLEU, and ROUGE scores. It was found that both models outperformed GPT-4, with the RAG model improving by 21.5 % and the fine-tuned LLM by 26 %. The findings highlight the relative strengths and weaknesses of the RAG and fine-tuned LLM approaches in terms of applicability and reliability for safety management.
期刊介绍:
Automation in Construction is an international journal that focuses on publishing original research papers related to the use of Information Technologies in various aspects of the construction industry. The journal covers topics such as design, engineering, construction technologies, and the maintenance and management of constructed facilities.
The scope of Automation in Construction is extensive and covers all stages of the construction life cycle. This includes initial planning and design, construction of the facility, operation and maintenance, as well as the eventual dismantling and recycling of buildings and engineering structures.