Assessing SPARQL capabilities of Large Language Models

arXiv - CS - Information Retrieval | Pub Date: 2024-09-09 | DOI: arxiv-2409.05925

Lars-Peter Meyer, Johannes Frey, Felix Brei, Natanael Arndt


Citations: 0

Abstract

The integration of Large Language Models (LLMs) with Knowledge Graphs (KGs) offers significant synergistic potential for knowledge-driven applications. One possible integration is the interpretation and generation of formal languages, such as those used in the Semantic Web, with SPARQL being a core technology for accessing KGs. In this paper, we focus on measuring the out-of-the-box capabilities of LLMs to work with SPARQL, and more specifically with SPARQL SELECT queries, applying a quantitative approach. We implemented various benchmarking tasks in the LLM-KG-Bench framework for automated execution and evaluation with several LLMs. The tasks assess capabilities along the dimensions of syntax, semantic read, semantic create, and the role of knowledge graph prompt inclusion. With these new benchmarking tasks, we evaluated a selection of GPT, Gemini, and Claude models. Our findings indicate that working with SPARQL SELECT queries is still challenging for LLMs and heavily depends on the specific LLM as well as the complexity of the task. While fixing basic syntax errors seems to pose no problems for the best of the current LLMs evaluated, creating semantically correct SPARQL SELECT queries is difficult in several cases.
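The abstract does not say how a generated query is judged semantically correct. As a purely illustrative sketch, and not the LLM-KG-Bench implementation, one plausible way to score a "semantic create" task is to run both the reference query and the LLM-generated query against the same graph and compare their result sets. The function name `score_result_sets`, the row representation, and the sample rows below are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# A result row is modelled as a tuple of bound values (assumption for this sketch).
Row = Tuple[str, ...]


def score_result_sets(expected: Iterable[Row], actual: Iterable[Row]) -> Dict[str, float]:
    """Compare two SELECT result sets as unordered sets of rows.

    Returns precision, recall, and F1 of the actual rows against the expected rows.
    Row order and duplicate rows are ignored in this simplified scoring.
    """
    expected_set = set(expected)
    actual_set = set(actual)
    true_positives = len(expected_set & actual_set)

    precision = true_positives / len(actual_set) if actual_set else 0.0
    recall = true_positives / len(expected_set) if expected_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical rows: what the reference query returns vs. what the
# LLM-generated query returns on the same knowledge graph.
expected_rows = [("ex:alice",), ("ex:bob",), ("ex:carol",)]
generated_rows = [("ex:alice",), ("ex:bob",), ("ex:dave",)]

scores = score_result_sets(expected_rows, generated_rows)
print(scores)
```

Comparing result sets rather than query text tolerates syntactically different but equivalent queries, at the cost of depending on the test graph: two different queries can coincide on a small dataset.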
