Joint extraction of entities and relations by entity role recognition

Cognitive Robotics. Pub Date: 2022-01-01. DOI: 10.1016/j.cogr.2022.11.001

Xi Han, Qi-Ming Liu

Volume 2, Pages 234-241. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2667241322000210


Abstract

Jointly extracting entities and relations from unstructured text is a fundamental task in information extraction and a key step in constructing large knowledge graphs. Entities and relations are represented as relational triples of the form (subject, relation, object), or (s, r, o). Although triple extraction has been highly successful, challenges remain due to factors such as entity overlap. Recent work has demonstrated the excellent performance of joint extraction models; however, these methods still suffer from problems such as redundant prediction. Traditional methods for solving the overlap problem perform triple extraction under the full set of relations defined in the dataset. Because the number of relations in a sentence is much smaller than the full relation set, this leads to a large number of redundant predictions. To solve this problem, this paper decomposes the task into two steps: (1) entity and potential relation extraction, and (2) determination of the semantic roles of entities in triples. Specifically, we design several modules to extract the entities and relations in the sentence separately. We then use these entities and relations to construct candidate triples and predict the semantic role (subject or object) of each entity under the relation constraints to obtain the correct triples. In summary, we propose a model for identifying the semantic roles of entities in triples under relation constraints, which can effectively address the problem of redundant prediction. We also evaluate our model on two widely used public datasets, where it achieves strong performance, with F1 scores of 90.8 and 92.4 on NYT and WebNLG, respectively.
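The two-step decomposition described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the paper's extraction and role-prediction modules are learned models, whereas here the function names (`candidate_pairs`, `assign_roles`, `toy_role_scorer`), the role labels, the example sentence data, and the figure of 24 dataset relations are all hypothetical stand-ins chosen for illustration. The sketch only shows how restricting candidates to the relations predicted for a sentence shrinks the search space, and how a role decision turns an unordered candidate into an ordered (s, r, o) triple.

```python
from itertools import combinations
from typing import Callable, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def candidate_pairs(entities: List[str], relations: List[str]) -> Iterator[Triple]:
    """Yield unordered candidates (entity_a, relation, entity_b).

    Only relations predicted for this sentence are used, not the
    full relation set of the dataset.
    """
    for rel in relations:
        for a, b in combinations(entities, 2):
            yield a, rel, b


def assign_roles(
    entities: List[str],
    relations: List[str],
    role_scorer: Callable[[str, str, str], str],
) -> List[Triple]:
    """Turn candidates into ordered triples using a role decision.

    role_scorer is a hypothetical stand-in for a learned classifier.
    It returns "a_subject", "b_subject", or "none".
    """
    triples: List[Triple] = []
    for a, rel, b in candidate_pairs(entities, relations):
        role = role_scorer(a, rel, b)
        if role == "a_subject":
            triples.append((a, rel, b))
        elif role == "b_subject":
            triples.append((b, rel, a))
    return triples


# Toy data (assumed, not from the paper).
entities = ["Barack Obama", "Honolulu", "Hawaii"]
predicted_relations = ["born_in", "located_in"]
gold = {
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
}


def toy_role_scorer(a: str, rel: str, b: str) -> str:
    """Lookup-based scorer used only to make the sketch runnable."""
    if (a, rel, b) in gold:
        return "a_subject"
    if (b, rel, a) in gold:
        return "b_subject"
    return "none"


triples = assign_roles(entities, predicted_relations, toy_role_scorer)

# Compare candidate counts: predicted relations vs. an illustrative
# full relation set of 24 types (the number is an assumption).
num_dataset_relations = 24
num_pairs = len(list(combinations(entities, 2)))
candidates_restricted = len(list(candidate_pairs(entities, predicted_relations)))
candidates_full = num_pairs * num_dataset_relations

print(triples)
print(candidates_restricted, "candidates instead of", candidates_full)
```

In this toy setting each candidate is resolved to at most one direction; a real model might need to allow both directions or multiple subjects and objects per relation, which the abstract does not specify.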



