Enhancing semantic search using ontologies: A hybrid information retrieval approach for industrial text

IF 10.4 · Zone 1 (Computer Science) · JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Syed Meesam Raza Naqvi, Mohammad Ghufran, Christophe Varnier, Jean-Marc Nicod, Noureddine Zerhouni
Journal of Industrial Information Integration, vol. 45, Article 100835. Published 2025-03-22. DOI: 10.1016/j.jii.2025.100835
https://www.sciencedirect.com/science/article/pii/S2452414X25000597
Citations: 0

Abstract

Despite the increased focus on data in Industry 4.0, textual data has received little attention in the production and engineering management literature. Data sources such as maintenance records and machine documentation are usually not used to support maintenance decision-making. Available studies mainly focus on categorizing maintenance records or extracting metadata such as time of failure and maintenance cost. One of the main reasons behind this underutilization is the complexity and unstructured nature of industrial text. In this study, we propose a novel hybrid information retrieval approach for industrial text using multi-modal learning. Maintenance operators can use the proposed system to query maintenance records and find similar solutions to a given problem. The proposed system utilizes heterogeneous (multi-modal) data, a combination of maintenance records and a machine ontology, to enhance semantic search results. For textual similarity, we used BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art Large Language Model (LLM). For similarity among ontology labels, we used a modified version of Wu-Palmer similarity. A hybrid weighted similarity is proposed that incorporates both text and ontology similarities to enhance semantic search results. The proposed approach was validated using an open-source dataset of real maintenance records from excavators, collected over ten years at different mining sites. Retrieval using text only is compared with retrieval using multi-modal data to estimate the proposed system's effectiveness. Quantitative and qualitative analysis of the results indicates a performance improvement of 8% for the proposed hybrid similarity approach compared to text-only retrieval. To the best of our knowledge, this is the first study to combine LLMs and a machine ontology for semantic search in maintenance records.
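
The hybrid weighted similarity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard Wu-Palmer formula (the paper uses a modified version whose details are not given here), a made-up toy ontology, plain cosine similarity on vectors as a stand-in for BERT embeddings, and a hypothetical weight `alpha` that the abstract does not specify.

```python
import math

# Hypothetical toy machine ontology (child -> parent); not taken from the paper.
PARENT = {
    "excavator": None,
    "hydraulic_system": "excavator",
    "engine": "excavator",
    "hydraulic_pump": "hydraulic_system",
    "hydraulic_hose": "hydraulic_system",
    "fuel_filter": "engine",
}


def path_to_root(node):
    """Return the list [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Standard Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first node on b's path that is also an ancestor of a (or a itself)
    # is the least common subsumer.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def cosine(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def hybrid_similarity(text_sim, onto_sim, alpha=0.5):
    """Weighted combination of the two scores; alpha is an assumed weight."""
    return alpha * text_sim + (1 - alpha) * onto_sim


# Usage: score one record against a query. The vectors are placeholders
# for sentence embeddings that a BERT-style encoder would produce.
query_vec, record_vec = [0.2, 0.7, 0.1], [0.3, 0.6, 0.2]
score = hybrid_similarity(
    cosine(query_vec, record_vec),
    wu_palmer("hydraulic_pump", "hydraulic_hose"),
    alpha=0.5,
)
```

With a linear weighting like this, `alpha = 1` reduces to text-only retrieval, which is the baseline the abstract compares against.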
Source journal
Journal of Industrial Information Integration
Subject area: Decision Sciences-Information Systems and Management
CiteScore: 22.30
Self-citation rate: 13.40%
Articles published: 100
About the journal: The Journal of Industrial Information Integration focuses on the industry's transition towards industrial integration and informatization, covering not only hardware and software but also information integration. It serves as a platform for promoting advances in industrial information integration, addressing challenges, issues, and solutions in an interdisciplinary forum for researchers, practitioners, and policy makers. The journal welcomes papers on foundational, technical, and practical aspects of industrial information integration, emphasizing the complex and cross-disciplinary topics that arise in industrial integration. Techniques from mathematical science, computer science, computer engineering, electrical and electronic engineering, manufacturing engineering, and engineering management are crucial in this context.