{"title":"没有人是完美的:在DBpedia知识图上分析问答组件的性能","authors":"Kuldeep Singh , Ioanna Lytra , Arun Sethupat Radhakrishna , Saeedeh Shekarpour , Maria-Esther Vidal , Jens Lehmann","doi":"10.1016/j.websem.2020.100594","DOIUrl":null,"url":null,"abstract":"<div><p><span>Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of Question Answering for user interaction. Existing QA systems have been extensively evaluated as black boxes and their performance has been characterised in terms of average results over all the questions of </span>benchmarking datasets (i.e. macro evaluation). Albeit informative, macro evaluation studies do not provide evidence about QA components’ strengths and concrete weaknesses. Therefore, the objective of this article is to analyse and micro evaluate available QA components in order to comprehend which question characteristics impact on their performance. For this, we measure at question level and with respect to different question features the accuracy of 29 components reused in QA frameworks for the DBpedia knowledge graph using state-of-the-art benchmarks. As a result, we provide a perspective on collective failure cases, study the similarities and synergies among QA components for different component types and suggest their characteristics preventing them from effectively solving the corresponding QA tasks. 
Finally, based on these extensive results, we present conclusive insights for future challenges and research directions in the field of Question Answering over knowledge graphs.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.websem.2020.100594","citationCount":"29","resultStr":"{\"title\":\"No one is perfect: Analysing the performance of question answering components over the DBpedia knowledge graph\",\"authors\":\"Kuldeep Singh , Ioanna Lytra , Arun Sethupat Radhakrishna , Saeedeh Shekarpour , Maria-Esther Vidal , Jens Lehmann\",\"doi\":\"10.1016/j.websem.2020.100594\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p><span>Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of Question Answering for user interaction. Existing QA systems have been extensively evaluated as black boxes and their performance has been characterised in terms of average results over all the questions of </span>benchmarking datasets (i.e. macro evaluation). Albeit informative, macro evaluation studies do not provide evidence about QA components’ strengths and concrete weaknesses. Therefore, the objective of this article is to analyse and micro evaluate available QA components in order to comprehend which question characteristics impact on their performance. For this, we measure at question level and with respect to different question features the accuracy of 29 components reused in QA frameworks for the DBpedia knowledge graph using state-of-the-art benchmarks. 
As a result, we provide a perspective on collective failure cases, study the similarities and synergies among QA components for different component types and suggest their characteristics preventing them from effectively solving the corresponding QA tasks. Finally, based on these extensive results, we present conclusive insights for future challenges and research directions in the field of Question Answering over knowledge graphs.</p></div>\",\"PeriodicalId\":49951,\"journal\":{\"name\":\"Journal of Web Semantics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1016/j.websem.2020.100594\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Web Semantics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1570826820300342\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Web Semantics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1570826820300342","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
No one is perfect: Analysing the performance of question answering components over the DBpedia knowledge graph
Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of question answering for user interaction. Existing QA systems have been extensively evaluated as black boxes, and their performance has been characterised in terms of average results over all the questions of benchmarking datasets (i.e. macro evaluation). Albeit informative, macro evaluation studies do not provide evidence about the strengths and concrete weaknesses of individual QA components. Therefore, the objective of this article is to analyse and micro evaluate available QA components in order to understand which question characteristics impact their performance. To this end, we measure, at the question level and with respect to different question features, the accuracy of 29 components reused in QA frameworks for the DBpedia knowledge graph, using state-of-the-art benchmarks. As a result, we provide a perspective on collective failure cases, study the similarities and synergies among QA components of different component types, and identify the characteristics that prevent them from effectively solving the corresponding QA tasks. Finally, based on these extensive results, we present conclusive insights into future challenges and research directions in the field of question answering over knowledge graphs.
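The macro vs. micro distinction drawn in the abstract can be illustrated with a small sketch: a macro evaluation averages correctness over all benchmark questions, while a micro evaluation breaks accuracy down by a question feature. All data, feature names, and values below are illustrative, not taken from the paper's benchmarks.

```python
from collections import defaultdict

# Hypothetical per-question results for one component:
# (question features, answered correctly?)
results = [
    ({"type": "simple"}, True),
    ({"type": "simple"}, True),
    ({"type": "compound"}, False),
    ({"type": "compound"}, True),
]

# Macro evaluation: a single average over all benchmark questions.
macro_accuracy = sum(ok for _, ok in results) / len(results)

# Micro evaluation: accuracy broken down by a question feature,
# exposing weaknesses that the macro score hides.
by_feature = defaultdict(list)
for features, ok in results:
    by_feature[features["type"]].append(ok)
micro_accuracy = {t: sum(v) / len(v) for t, v in by_feature.items()}

print(macro_accuracy)   # 0.75
print(micro_accuracy)   # {'simple': 1.0, 'compound': 0.5}
```

Here the macro score of 0.75 looks respectable, but the per-feature breakdown reveals that the component fails on half of the compound questions — exactly the kind of evidence a black-box evaluation cannot provide.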
About the Journal:
The Journal of Web Semantics is an interdisciplinary journal based on research and applications in subject areas that contribute to the development of a knowledge-intensive and intelligent service Web. These areas include knowledge technologies, ontologies, agents, databases, and the semantic grid; disciplines such as information retrieval, language technology, human-computer interaction, and knowledge discovery are of major relevance as well. All aspects of Semantic Web development are covered. The publication of large-scale experiments and their analysis is also encouraged, to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents, and services. The journal emphasizes papers that combine theories, methods, and experiments from different subject areas in order to deliver innovative semantic methods and applications.