Large Language Models and Data Quality for Knowledge Graphs
Stefano Marchesin, Gianmaria Silvello, Omar Alonso
Information Processing & Management, Vol. 62, No. 6, Article 104281
DOI: 10.1016/j.ipm.2025.104281
Published: 2025-07-08
Citations: 0
Abstract
Knowledge Graphs (KGs) have become essential for applications such as virtual assistants, web search, reasoning, and information access and management. Prominent examples include Wikidata, DBpedia, YAGO, and NELL, which are widely used by large companies for structuring and integrating data. Constructing KGs involves various AI-driven processes, including data integration, entity recognition, relation extraction, and active learning. However, automated methods often lead to sparsity and inaccuracies, making rigorous KG quality evaluation crucial for improving construction methodologies and ensuring reliable downstream applications. Despite its importance, large-scale KG quality assessment remains an underexplored research area.
The rise of Large Language Models (LLMs) introduces both opportunities and challenges for KG construction and evaluation. LLMs can enhance contextual understanding and reasoning in KG systems but also pose risks, such as introducing misinformation or “hallucinations” that could degrade KG integrity. Effectively integrating LLMs into KG workflows requires robust quality control mechanisms to manage errors and ensure trustworthiness.
This special issue explores the intersection of KGs and LLMs, emphasizing human–machine collaboration for KG construction and evaluation. We present contributions on LLM-assisted KG generation, large-scale KG quality assessment, and quality control mechanisms for mitigating LLM-induced errors. Topics covered include KG construction methodologies, LLM deployment in KG systems, scalable KG evaluation, human-in-the-loop approaches, domain-specific applications, and industrial KG maintenance. By advancing research in these areas, this issue fosters innovation at the convergence of KGs and LLMs.
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
The journal aims to serve both researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical work in this interdisciplinary field. It places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.