Distant Supervision for Relation Extraction via Noise Filtering

Jing Chen, Zhiqiang Guo, Jie Yang
Published in: 2021 13th International Conference on Machine Learning and Computing
Publication date: 2021-02-26
DOI: 10.1145/3457682.3457743 (https://doi.org/10.1145/3457682.3457743)
Citations: 4

Abstract

Distant supervision, a widely used method for relation extraction, is affected by label noise. The noise is introduced artificially by the underlying theory of distant supervision, and it restricts performance during modeling. To address this problem at the sentence level, we model the relation extraction task in two parts: a sentence selector and a relation extractor. The sentence selector, based on reinforcement learning, processes the corpus one entity pair at a time. The training corpus is divided into three parts: selected sentences, discarded sentences, and unlabeled sentences. We obtain more semantic information from the training corpus by introducing intra-class attention and inter-class similarity. To filter noisy data more accurately, the model evaluates the predicted values that the relation extractor produces for the selected and the discarded sentences within each sentence package. The results show that WPR-RL, the redesigned reinforcement learning algorithm proposed in this study, significantly improves on the deficiencies of the existing approach. We also carry out a number of composite tests to examine the impact of each improvement on model performance.
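The abstract does not give the reward used by WPR-RL, so the following is only a minimal sketch of the general idea of scoring a selector's decision by contrasting the relation extractor's predictions on selected and discarded sentences. The function names (`partition_bag`, `mean_prob`, `contrast_reward`), the use of a plain difference of mean probabilities, and the example numbers are all assumptions made for illustration, not the authors' formulation. The third group mentioned in the abstract, unlabeled sentences, is left out because the abstract does not say how it is formed.

```python
from typing import List, Sequence, Tuple


def partition_bag(actions: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split the sentence indices of one entity-pair bag by selector action.

    actions[i] == 1 means sentence i is selected, 0 means it is discarded.
    """
    selected = [i for i, a in enumerate(actions) if a == 1]
    discarded = [i for i, a in enumerate(actions) if a == 0]
    return selected, discarded


def mean_prob(probs: Sequence[float], indices: Sequence[int]) -> float:
    """Average extractor probability of the bag's label over some sentences."""
    if not indices:
        return 0.0  # assumption: an empty group contributes nothing
    return sum(probs[i] for i in indices) / len(indices)


def contrast_reward(probs: Sequence[float], actions: Sequence[int]) -> float:
    """Hypothetical reward: selected sentences should score higher than
    discarded ones under the relation extractor."""
    selected, discarded = partition_bag(actions)
    return mean_prob(probs, selected) - mean_prob(probs, discarded)


# Made-up extractor probabilities for the distantly supervised label
# on four sentences that mention the same entity pair.
probs = [0.9, 0.8, 0.1, 0.2]
good = contrast_reward(probs, [1, 1, 0, 0])  # keeps the confident sentences
bad = contrast_reward(probs, [0, 0, 1, 1])   # keeps the likely-noisy ones
print(good > bad)
```

Under this assumed reward, a selector that keeps the sentences the extractor is confident about is rewarded, and one that keeps the likely-noisy sentences is penalized. A real system would feed such a signal into a policy-gradient update for the selector.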