Natural vs programming language in LLM knowledge graph construction

IF 7.4 · Region 1 (Management Science) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Paolo Gajo, Alberto Barrón-Cedeño
{"title":"Natural vs programming language in LLM knowledge graph construction","authors":"Paolo Gajo,&nbsp;Alberto Barrón-Cedeño","doi":"10.1016/j.ipm.2025.104195","DOIUrl":null,"url":null,"abstract":"<div><div>Research on knowledge graph construction (KGC) has recently shown great promise also thanks to the adoption of large language models (LLM) for the automatic extraction of structured information from raw text. However, most works rely on commercial, closed-source LLMs, hindering reproducibility and accessibility. We explore KGC with smaller, open-weight LLMs and investigate whether they can be used to improve upon the results obtained by systems leveraging bigger, closed-source models. Specifically, we focus on CodeKGC, a prompting framework based on GPT-3.5. We choose a variety of models either pre-trained primarily on natural language or on code and fine-tune them on three datasets used for information extraction. We fine-tune with prompts formatted either in natural language or as Python-like scripts. In addition, we optionally train the models with prompts including chain-of-thought sections. After fine-tuning, the choice of coding vs natural language prompts has a limited impact on performance, while chain-of-thought training mostly leads to a performance decrease. Moreover, we show that a LLM can be outperformed by much smaller versions on this task, after undergoing the same amount of training. We find that in general the selected lightweight LLMs outperform the much larger CodeKGC by as much as 15–20 absolute F<span><math><msub><mrow></mrow><mrow><mn>1</mn></mrow></msub></math></span> points after fine-tuning. The results show that state-of-the-art KGC systems can be developed using smaller and open-weight models, enhancing research transparency, lowering compute requirements, and decreasing third-party API reliance.</div><div>Code:</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104195"},"PeriodicalIF":7.4000,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457325001360","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Research on knowledge graph construction (KGC) has recently shown great promise, thanks in part to the adoption of large language models (LLMs) for the automatic extraction of structured information from raw text. However, most works rely on commercial, closed-source LLMs, hindering reproducibility and accessibility. We explore KGC with smaller, open-weight LLMs and investigate whether they can be used to improve upon the results obtained by systems leveraging bigger, closed-source models. Specifically, we focus on CodeKGC, a prompting framework based on GPT-3.5. We choose a variety of models pre-trained primarily either on natural language or on code and fine-tune them on three datasets used for information extraction. We fine-tune with prompts formatted either in natural language or as Python-like scripts. In addition, we optionally train the models with prompts including chain-of-thought sections. After fine-tuning, the choice of code vs natural-language prompts has a limited impact on performance, while chain-of-thought training mostly leads to a performance decrease. Moreover, we show that an LLM can be outperformed by much smaller versions on this task after undergoing the same amount of training. We find that, in general, the selected lightweight LLMs outperform the much larger CodeKGC by as much as 15–20 absolute F1 points after fine-tuning. The results show that state-of-the-art KGC systems can be developed using smaller, open-weight models, enhancing research transparency, lowering compute requirements, and decreasing reliance on third-party APIs.
Code:
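To make the abstract's distinction between "natural language" and "Python-like script" prompts concrete, the sketch below illustrates, at a high level, how the same relational triples might be expressed in the two target formats of a CodeKGC-style setup. The class names, relation names, and example triples are hypothetical illustrations, not the paper's actual prompt templates or schema.

```python
# Minimal sketch of the two output formats compared in the paper.
# All names and triples here are illustrative assumptions, not the
# paper's actual CodeKGC prompt schema.

from dataclasses import dataclass


@dataclass
class Entity:
    name: str


@dataclass
class Triple:
    head: Entity
    relation: str
    tail: Entity


# Code-style target: the model emits triples as Python-like object constructions.
code_style_output = [
    Triple(Entity("Marie Curie"), "worked_at", Entity("University of Paris")),
    Triple(Entity("Marie Curie"), "field_of_work", Entity("physics")),
]

# Natural-language target: the same triples expressed as plain text.
natural_language_output = (
    "(Marie Curie, worked_at, University of Paris); "
    "(Marie Curie, field_of_work, physics)"
)
```

In both cases the underlying extraction task is identical; only the surface form the model is fine-tuned to produce differs, which is the variable the paper reports as having limited impact after fine-tuning.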
Source journal
Information Processing & Management (Engineering & Technology – Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. The journal caters to both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field, with particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.