MLPQ: A Dataset for Path Question Answering over Multilingual Knowledge Graphs

Yiming Tan, Yongrui Chen, Guilin Qi, Weizhuo Li, Meng Wang

Big Data Research, published 2023-05-28. DOI: 10.1016/j.bdr.2023.100381. Citations: 2.

Abstract

Knowledge Graph-based Multilingual Question Answering (KG-MLQA), one of the essential subtasks of Knowledge Graph-based Question Answering (KGQA), allows questions to be expressed in different languages, bridging the lexical gap between questions and knowledge graph(s). However, existing KG-MLQA work focuses mainly on the semantic parsing of multilingual questions and ignores questions that require integrating information from cross-lingual knowledge graphs (CLKG). This paper extends KG-MLQA to Cross-lingual KG-based multilingual Question Answering (CLKGQA) and constructs the first CLKGQA dataset over multilingual DBpedia, named MLPQ, which contains 300K questions in English, Chinese, and French. We further propose a novel KG sampling algorithm for KG construction, enabling MLPQ to support research on different types of methods. To evaluate the dataset, we put forward a general question answering workflow whose core idea is to transform CLKGQA into KG-MLQA: an Entity Alignment (EA) model first merges the CLKG into a single KG, and a multi-hop QA model combined with a multilingual pre-trained model then derives the answer. By instantiating this workflow, we establish two baseline models for MLPQ, one using Google Translate to obtain aligned entities and the other adopting a recent EA model. Experiments show that the baseline models fall short of ideal performance on CLKGQA. Moreover, the availability of our benchmark contributes to the question answering and entity alignment communities.
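The workflow above reduces CLKGQA to KG-MLQA in two steps: merge the cross-lingual KGs into a single KG using entity-alignment output, then answer the path question over the merged graph. A minimal sketch of that reduction, with toy triples and a hand-written alignment standing in for the EA model's output (none of the entity or relation names below come from MLPQ itself):

```python
# Sketch of the CLKGQA-to-KG-MLQA reduction: (1) merge two KGs via an
# entity-alignment mapping, (2) answer a multi-hop path question by
# relation traversal. All names are illustrative.

from collections import defaultdict

def merge_kgs(kg_a, kg_b, alignment):
    """Merge two KGs (sets/lists of (head, relation, tail) triples).

    `alignment` maps entity IDs in kg_b to their aligned IDs in kg_a,
    standing in for the output of an entity-alignment (EA) model.
    """
    canon = lambda e: alignment.get(e, e)
    merged = set(kg_a)
    merged.update((canon(h), r, canon(t)) for h, r, t in kg_b)
    return merged

def answer_path_question(kg, start, relations):
    """Answer a path question by following the relation chain from `start`."""
    index = defaultdict(list)
    for h, r, t in kg:
        index[(h, r)].append(t)
    frontier = {start}
    for rel in relations:
        frontier = {t for e in frontier for t in index[(e, rel)]}
    return frontier

# Toy cross-lingual fragments: one "English" KG, one "French" KG.
kg_en = [("Paris", "capitalOf", "France")]
kg_fr = [("fr:France", "continent", "Europe")]
alignment = {"fr:France": "France"}  # hypothetical EA output

merged = merge_kgs(kg_en, kg_fr, alignment)
# 2-hop question: "On which continent is the country Paris is the capital of?"
print(answer_path_question(merged, "Paris", ["capitalOf", "continent"]))
# → {'Europe'}
```

The answer requires one hop from each source KG, which is exactly the cross-lingual integration that single-KG multilingual QA cannot perform; the EA mapping is what makes the second hop reachable.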
