An Advanced BERT-Based Decomposition Method for Joint Extraction of Entities and Relations

Changhai Wang, Aiping Li, Hongkui Tu, Ye Wang, Chenchen Li, Xiaojuan Zhao
{"title":"一种基于bert的实体与关系联合抽取的高级分解方法","authors":"Changhai Wang, Aiping Li, Hongkui Tu, Ye Wang, Chenchen Li, Xiaojuan Zhao","doi":"10.1109/DSC50466.2020.00021","DOIUrl":null,"url":null,"abstract":"Joint extraction of entities and relations is an important task in the field of natural language processing and the basis of many NLP high-level tasks. However, most existing joint models cannot solve the problem of overlapping triples well. We propose an efficient end-to-end model for joint extraction of entities and overlapping relations. Firstly, the BERT pre-training model is introduced to model the text more finely. Next, We decompose triples extraction into two subtasks: head entity extraction and tail entity extraction, which solves the problem of single entity overlap in the triples. Then, We divide the tail entity extraction into three parallel extraction sub-processes to solve entity pair overlap problem of triples, that is the relation overlap problem. Finally, We transform each extraction sub-process into sequence tag task. We evaluate our model on the New York Times (NYT) dataset and achieve overwhelming results compared with most of the current models, Precise =0.870, Recall = 0.851, and F1 = 0.860. The experimental results show that our model is effective in dealing with triples overlap problem.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"An Advanced BERT-Based Decomposition Method for Joint Extraction of Entities and Relations\",\"authors\":\"Changhai Wang, Aiping Li, Hongkui Tu, Ye Wang, Chenchen Li, Xiaojuan Zhao\",\"doi\":\"10.1109/DSC50466.2020.00021\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Joint extraction of entities and relations is an important task in the field of natural language processing and the basis of many NLP high-level tasks. However, most existing joint models cannot solve the problem of overlapping triples well. We propose an efficient end-to-end model for joint extraction of entities and overlapping relations. Firstly, the BERT pre-training model is introduced to model the text more finely. Next, We decompose triples extraction into two subtasks: head entity extraction and tail entity extraction, which solves the problem of single entity overlap in the triples. Then, We divide the tail entity extraction into three parallel extraction sub-processes to solve entity pair overlap problem of triples, that is the relation overlap problem. Finally, We transform each extraction sub-process into sequence tag task. We evaluate our model on the New York Times (NYT) dataset and achieve overwhelming results compared with most of the current models, Precise =0.870, Recall = 0.851, and F1 = 0.860. 
The experimental results show that our model is effective in dealing with triples overlap problem.\",\"PeriodicalId\":423182,\"journal\":{\"name\":\"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DSC50466.2020.00021\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSC50466.2020.00021","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Joint extraction of entities and relations is an important task in natural language processing and the basis of many higher-level NLP tasks. However, most existing joint models cannot handle overlapping triples well. We propose an efficient end-to-end model for the joint extraction of entities and overlapping relations. First, the BERT pre-trained model is introduced to encode the text at a finer granularity. Next, we decompose triple extraction into two subtasks, head-entity extraction and tail-entity extraction, which solves the single-entity overlap problem in triples. Then, we divide tail-entity extraction into three parallel extraction sub-processes to solve the entity-pair overlap problem of triples, i.e., the relation overlap problem. Finally, we cast each extraction sub-process as a sequence tagging task. We evaluate our model on the New York Times (NYT) dataset, where it outperforms most current models with Precision = 0.870, Recall = 0.851, and F1 = 0.860. The experimental results show that our model is effective in dealing with the triple overlap problem.
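The abstract does not spell out the taggers' internals, but the pipeline it outlines — a shared BERT encoder feeding one head-entity tagger and three parallel tail-entity taggers, each cast as per-token sequence tagging — can be sketched roughly as below. All names here (JointTripleTagger, the BIO-style tag counts, NUM_TAIL_BRANCHES) are illustrative assumptions rather than the authors' code, and the sketch omits how each tail tagger would be conditioned on the head entity found in the first subtask.

```python
# Minimal sketch, assuming BIO-style tagging for each sub-process.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

NUM_TAIL_BRANCHES = 3  # the abstract's three parallel tail-extraction sub-processes
NUM_HEAD_TAGS = 3      # e.g. B/I/O tags for head-entity spans (assumed scheme)
NUM_TAIL_TAGS = 3      # e.g. B/I/O tags for tail-entity spans (assumed scheme)

class JointTripleTagger(nn.Module):
    """A shared BERT encoder feeding one head-entity tagger and three
    parallel tail-entity taggers, each a per-token classification layer."""
    def __init__(self, bert_name: str = "bert-base-cased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        self.head_tagger = nn.Linear(hidden, NUM_HEAD_TAGS)
        # Parallel taggers let one head entity pair with several tails,
        # covering the entity-pair (relation) overlap case.
        self.tail_taggers = nn.ModuleList(
            nn.Linear(hidden, NUM_TAIL_TAGS) for _ in range(NUM_TAIL_BRANCHES)
        )

    def forward(self, input_ids, attention_mask):
        enc = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        h = enc.last_hidden_state                        # (batch, seq, hidden)
        head_logits = self.head_tagger(h)                # tag head-entity spans
        tail_logits = [t(h) for t in self.tail_taggers]  # one tag sequence per branch
        return head_logits, tail_logits

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = JointTripleTagger()
batch = tokenizer(["Obama was born in Honolulu, Hawaii."],
                  return_tensors="pt", padding=True)
with torch.no_grad():
    head_logits, tail_logits = model(batch["input_ids"], batch["attention_mask"])
print(head_logits.shape, [t.shape for t in tail_logits])
```

Keeping several separate tail taggers is what allows a single head entity to participate in more than one triple at once, which is the relation overlap problem the abstract targets.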