Ontology-based prompting with large language models for inferring construction activities from construction images

IF 9.9 · CAS Tier 1 (Engineering & Technology) · Q1 in Computer Science, Artificial Intelligence
Cheng Zeng, Timo Hartmann, Leyuan Ma
{"title":"Ontology-based prompting with large language models for inferring construction activities from construction images","authors":"Cheng Zeng ,&nbsp;Timo Hartmann ,&nbsp;Leyuan Ma","doi":"10.1016/j.aei.2025.103869","DOIUrl":null,"url":null,"abstract":"<div><div>Recognizing construction activities from images enhances decision-making by providing context-aware insights into project progress, resource allocation, and productivity. However, conventional approaches, such as supervised learning and knowledge-based approach, struggle to generalize to the dynamic nature of construction sites due to limited annotated data and rigid knowledge patterns. To address these limitations, we propose a novel method that integrates Large Language Models (LLMs) with structured domain knowledge via ontology-based prompting. In our approach, visual features such as entities, spatial arrangements, and actions are mapped to predefined concepts in a construction-specific ontology, resulting in symbolic scene representations. In-context learning is employed by constructing prompts that include multiple structured examples, each describing a scenario with its associated activities. By analyzing these ontology-grounded examples, the LLM learns patterns that connect symbolic representations to construction activity labels, enabling generalization to new, unseen scenes. We evaluated the method using GPT-based models on a dataset covering 29 construction activity types. The model achieved an activity recognition accuracy of 73.68 %, and 50.00 % when jointly identifying the activity and its associated entities. Ablation studies confirmed the positive effects of including Chain-of-Thought reasoning, diverse visual concepts, and richer context examples. These results demonstrate the potential of ontology-informed prompting to support scalable and adaptive visual understanding in construction domains.</div></div>","PeriodicalId":50941,"journal":{"name":"Advanced Engineering Informatics","volume":"69 ","pages":"Article 103869"},"PeriodicalIF":9.9000,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advanced Engineering Informatics","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1474034625007621","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Recognizing construction activities from images enhances decision-making by providing context-aware insights into project progress, resource allocation, and productivity. However, conventional approaches, such as supervised learning and knowledge-based approaches, struggle to generalize to the dynamic nature of construction sites due to limited annotated data and rigid knowledge patterns. To address these limitations, we propose a novel method that integrates Large Language Models (LLMs) with structured domain knowledge via ontology-based prompting. In our approach, visual features such as entities, spatial arrangements, and actions are mapped to predefined concepts in a construction-specific ontology, resulting in symbolic scene representations. In-context learning is employed by constructing prompts that include multiple structured examples, each describing a scenario with its associated activities. By analyzing these ontology-grounded examples, the LLM learns patterns that connect symbolic representations to construction activity labels, enabling generalization to new, unseen scenes. We evaluated the method using GPT-based models on a dataset covering 29 construction activity types. The model achieved an accuracy of 73.68% for activity recognition, and 50.00% when jointly identifying the activity and its associated entities. Ablation studies confirmed the positive effects of including Chain-of-Thought reasoning, diverse visual concepts, and richer context examples. These results demonstrate the potential of ontology-informed prompting to support scalable and adaptive visual understanding in construction domains.
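The abstract outlines the pipeline only at a high level: visual concepts are mapped onto ontology terms to form a symbolic scene representation, and several such representations, paired with activity labels and short rationales, are packed into an in-context prompt. The following is a minimal Python sketch of what such a prompt builder might look like; every name here (ACTIVITY_ONTOLOGY, SceneGraph, build_prompt) and the toy ontology entries are hypothetical illustrations, not the paper's actual implementation.

```python
# Illustrative sketch only: the paper does not publish its implementation, so
# all names below and the toy ontology content are hypothetical.
from dataclasses import dataclass

# A toy slice of a construction ontology: activity -> expected concepts.
ACTIVITY_ONTOLOGY = {
    "Excavation": {"entities": ["Excavator", "Soil"], "action": "digging"},
    "Rebar installation": {"entities": ["Worker", "Rebar"], "action": "tying"},
}

@dataclass
class SceneGraph:
    """Symbolic scene representation: image concepts mapped onto ontology terms."""
    entities: list[str]   # e.g. ["Excavator", "Soil"]
    relations: list[str]  # e.g. ["Excavator near Soil"]
    actions: list[str]    # e.g. ["digging"]

    def to_text(self) -> str:
        return (f"Entities: {', '.join(self.entities)}. "
                f"Spatial relations: {'; '.join(self.relations)}. "
                f"Actions: {', '.join(self.actions)}.")

# Ontology-grounded in-context examples: scene -> rationale -> label.
# The short rationale mimics the Chain-of-Thought step the ablations credit.
EXAMPLES = [
    (SceneGraph(["Excavator", "Soil"], ["Excavator near Soil"], ["digging"]),
     "An excavator performing a digging action on soil matches Excavation.",
     "Excavation"),
]

def build_prompt(query: SceneGraph) -> str:
    """Assemble a few-shot prompt from symbolic scene descriptions."""
    candidates = ", ".join(ACTIVITY_ONTOLOGY)
    parts = [
        "You are given symbolic descriptions of construction scenes. "
        f"Candidate activities: {candidates}. "
        "Infer the activity and its associated entities."
    ]
    for scene, rationale, label in EXAMPLES:
        parts.append(f"Scene: {scene.to_text()}\n"
                     f"Reasoning: {rationale}\n"
                     f"Activity: {label}")
    parts.append(f"Scene: {query.to_text()}\nReasoning:")  # model completes this
    return "\n\n".join(parts)

# The resulting string would be sent to a GPT-style chat completion endpoint.
query = SceneGraph(["Worker", "Rebar"], ["Worker holding Rebar"], ["tying"])
print(build_prompt(query))
```

Keeping the scene description symbolic rather than pixel-based is what lets the same few-shot examples transfer to unseen scenes: the LLM matches ontology terms rather than raw image features.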
Source journal
Advanced Engineering Informatics (Engineering & Technology, Engineering: Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Annual publications: 292
Review time: 45 days
Journal description: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The Journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.