Noisy Positive-Unlabeled Learning with Self-Training for Speculative Knowledge Graph Reasoning

Annual Meeting of the Association for Computational Linguistics. Pub Date: 2023-06-13. DOI: 10.48550/arXiv.2306.07512

Ruijie Wang, Baoyu Li, Yichen Lu, Dachun Sun, Jinning Li, Yuchen Yan, Shengzhong Liu, H. Tong, T. Abdelzaher


Citations: 1

Abstract

This paper studies the speculative reasoning task on real-world knowledge graphs (KGs) that suffer from both the false negative issue (i.e., potentially true facts being excluded) and the false positive issue (i.e., unreliable or outdated facts being included). State-of-the-art methods fall short in speculative reasoning ability because they assume that the correctness of a fact is determined solely by its presence in the KG, which makes them vulnerable to both issues. The new reasoning task is formulated as a noisy Positive-Unlabeled learning problem. We propose a variational framework, nPUGraph, that jointly estimates the correctness of both collected and uncollected facts (which we call the label posterior) and updates model parameters during training. Label posterior estimation facilitates speculative reasoning in two ways. First, it improves the robustness of a label posterior-aware graph encoder against false positive links. Second, it identifies missing facts to provide high-quality grounds for reasoning. The two are unified in a simple yet effective self-training procedure. Empirically, extensive experiments on three benchmark KGs and one Twitter dataset with varying degrees of false negative/positive cases demonstrate the effectiveness of nPUGraph.
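To make the idea of a label posterior concrete, here is a minimal toy sketch of noisy Positive-Unlabeled self-training. It is not the nPUGraph implementation. It assumes a one-dimensional feature per candidate fact, a logistic scorer in place of the graph encoder, and two fixed noise parameters that do not appear in the abstract: `a`, the probability that a true fact is collected, and `b`, the probability that a false fact is collected. All function and parameter names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def label_posterior(p_true, collected, a, b):
    """Bayes update for P(fact is true | model score, collected flag).

    p_true    : model's current P(true) for each candidate fact
    collected : 1 if the fact is present in the KG, 0 if it is uncollected
    a, b      : assumed P(collected | true) and P(collected | false)
    """
    like_true = np.where(collected == 1, a, 1.0 - a)
    like_false = np.where(collected == 1, b, 1.0 - b)
    num = like_true * p_true
    return num / (num + like_false * (1.0 - p_true))


def self_train(x, collected, a=0.7, b=0.1, rounds=20, steps=200, lr=0.5):
    """Alternate between fitting the scorer and re-estimating soft labels."""
    w, c = 0.0, 0.0
    q = collected.astype(float)  # start by trusting the observed graph
    for _ in range(rounds):
        # Fit the logistic scorer to the current soft labels q
        # (gradient descent on the cross-entropy loss).
        for _ in range(steps):
            p = sigmoid(w * x + c)
            w -= lr * np.mean((p - q) * x)
            c -= lr * np.mean(p - q)
        # Re-estimate the label posterior for collected and uncollected facts.
        q = label_posterior(sigmoid(w * x + c), collected, a, b)
    return w, c, q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.integers(0, 2, size=500)
    x = rng.normal(loc=np.where(truth == 1, 2.0, -2.0), scale=1.0)
    # Noisy collection: some true facts are missing, some false facts are included.
    collected = (rng.random(500) < np.where(truth == 1, 0.7, 0.1)).astype(int)
    w, c, q = self_train(x, collected)
    print("uncollected facts with posterior > 0.5:",
          int(np.sum((collected == 0) & (q > 0.5))))
```

In this sketch, a collected fact with a low posterior plays the role of a suspected false positive, and an uncollected fact with a high posterior plays the role of a suspected missing fact. The abstract describes using both signals inside a self-training procedure. The fixed `a` and `b` are a simplification made here for illustration.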

