Zinan Wang , Baoguo Liu , Xiaomeng Shi , Zhiyun Deng , Jinglai Sun
Title: AI-based hybrid knowledge extraction method in complex engineering scenarios: A case study of drill-and-blast tunnelling excavation
Journal: Computers & Industrial Engineering, Volume 208, Article 111375 (impact factor 6.5; JCR Q1, Computer Science, Interdisciplinary Applications)
DOI: 10.1016/j.cie.2025.111375
Published: 2025-07-17
URL: https://www.sciencedirect.com/science/article/pii/S0360835225005212
Citations: 0
Abstract
The explosive rise of generative AI models is reshaping both the technological trajectory and the developmental landscape of industrial intelligence. However, large language models show significant limitations when processing specialized engineering knowledge, because the knowledge systems are intricate and the domain expertise is fragmented across various unstructured sources. An AI-based hybrid knowledge extraction method (AHKEM) is proposed to address these challenges in complex engineering scenarios. The method integrates AI techniques and large language models into a systematic framework: TF–IDF analysis is combined with word vector semantics for entity identification across extensive textual corpora, BERT-BiLSTM-CRF is employed for entity recognition, and a novel two-stage hierarchical clustering-GPT relationship mining method (HC-GPT RMM) is used for relationship extraction. The approach was demonstrated through a case study of drill-and-blast tunnelling excavation, a typical engineering scenario with complex data characteristics. The corpus comprised 13 specifications and standards, 4 professional books, and 343 academic papers, and the resulting knowledge graph contains 1,607 entities and 1,582 relationships that support various intelligent applications in construction practice. The advantages of AHKEM in handling complex domain knowledge are further validated through comparative experiments with joint extraction approaches. This study provides a practical framework for knowledge extraction in engineering domains and demonstrates its application value through a specific construction scenario.
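As a rough illustration of the TF–IDF step named in the abstract, the sketch below ranks candidate terms in a small tokenized corpus. It is not the authors' implementation: the three-sentence corpus, the function names `tfidf_scores` and `top_terms`, and the plain `tf * log(N / df)` weighting are all assumptions made for this example, and the word-vector, BERT-BiLSTM-CRF, and HC-GPT RMM stages are not covered.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Relative term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(docs, k=3):
    """Rank terms by their highest tf-idf score in any single document."""
    best = {}
    for doc_scores in tfidf_scores(docs):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Hypothetical toy corpus; the paper's corpus is far larger.
corpus = [
    "blast hole spacing controls overbreak in the tunnel".split(),
    "the tunnel lining is installed after the blast".split(),
    "the detonator delay controls vibration in the tunnel".split(),
]

print(top_terms(corpus))
```

Terms that occur in every document, such as "the" and "tunnel" here, get a weight of zero, so only terms that distinguish one document from the others surface as entity candidates.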
About the journal:
Computers & Industrial Engineering (CAIE) is dedicated to researchers, educators, and practitioners in industrial engineering and related fields. Pioneering the integration of computers in research, education, and practice, industrial engineering has evolved to make computers and electronic communication integral to its domain. CAIE publishes original contributions focusing on the development of novel computerized methodologies to address industrial engineering problems. It also highlights the applications of these methodologies to issues within the broader industrial engineering and associated communities. The journal actively encourages submissions that push the boundaries of fundamental theories and concepts in industrial engineering techniques.