RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs

Paolo Rosso, Dingqi Yang, Natalia Ostapuk, Philippe Cudré-Mauroux
Proceedings of the Web Conference 2021 (WWW '21), published April 19, 2021
DOI: 10.1145/3442381.3449883
Citations: 9

Abstract

Knowledge Graph (KG) completion has been widely studied to tackle the incompleteness issue (i.e., missing facts) in modern KGs. A fact in a KG is represented as a triplet (h, r, t) linking two entities h and t via a relation r. Existing work mostly considers link prediction to solve this problem, i.e., given two elements of a triplet, predicting the missing one, such as (h, r, ?). This task, however, makes a strong assumption that the two given elements of a triplet are correlated; otherwise, it yields meaningless predictions, such as (Marie Curie, headquarters location, ?). In addition, the KG completion problem has also been formulated as a relation prediction task, i.e., predicting relations r for a given entity h. Without predicting t, however, this task remains a step away from the ultimate goal of KG completion. Against this background, this paper studies an instance completion task suggesting r-t pairs for a given h, i.e., (h, ?, ?). We propose an end-to-end solution called RETA (as it suggests the Relation and Tail for a given head entity) consisting of two components: a RETA-Filter and a RETA-Grader. More precisely, our RETA-Filter first generates candidate r-t pairs for a given h by extracting and leveraging the schema of a KG; our RETA-Grader then evaluates and ranks the candidate r-t pairs, considering the plausibility of both the candidate triplet and its corresponding schema, using a newly designed KG embedding model. We evaluate our methods against a sizable collection of state-of-the-art techniques on three real-world KG datasets. Results show that our RETA-Filter generates high-quality candidate r-t pairs, outperforming the best baseline techniques while reducing the candidate size by 10.61%-84.75% under the same candidate quality guarantees. Moreover, our RETA-Grader also significantly outperforms state-of-the-art link prediction techniques on the instance completion task, by 16.25%-65.92% across different datasets.
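The abstract does not detail the filtering step, but its core idea — using entity types (the schema) to restrict which r-t pairs are plausible for a head entity — can be illustrated with a minimal sketch. The toy facts and type assignments below are hypothetical, and this is only an approximation of the idea, not the authors' implementation:

```python
from collections import defaultdict

# Toy KG facts: (head, relation, tail) triplets (hypothetical data).
facts = [
    ("Marie_Curie", "field_of_work", "Physics"),
    ("Albert_Einstein", "field_of_work", "Physics"),
    ("Google", "headquarters_location", "Mountain_View"),
]

# Entity types -- the "schema" signal: entity -> set of types.
entity_types = {
    "Marie_Curie": {"Person"},
    "Albert_Einstein": {"Person"},
    "Google": {"Company"},
    "Physics": {"Field"},
    "Mountain_View": {"City"},
}

# Collect schema patterns observed in the KG:
# (head_type, relation) -> set of tail types seen with that pair.
pattern = defaultdict(set)
for h, r, t in facts:
    for ht in entity_types[h]:
        for tt in entity_types[t]:
            pattern[(ht, r)].add(tt)

def candidate_rt_pairs(head):
    """Schema-based filter: propose (r, t) pairs whose head/tail types
    match a pattern already observed in the KG for this head's types."""
    cands = set()
    for ht in entity_types[head]:
        for (pht, r), tail_types in pattern.items():
            if pht != ht:
                continue
            for t, tts in entity_types.items():
                if tts & tail_types:
                    cands.add((r, t))
    return cands

# A Person head gets field_of_work candidates, but no
# headquarters_location ones -- the schema rules them out.
print(sorted(candidate_rt_pairs("Marie_Curie")))
# → [('field_of_work', 'Physics')]
```

In the full system, the surviving candidates would then be scored by an embedding model (the RETA-Grader) that judges the plausibility of each candidate triplet together with its schema, rather than being returned unranked as here.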