Multi-turn automatic question answering based entity-relation extraction method for power technology standards

Shiqing Wang, Peng Wang, Sunan Jiang, Feng Shang
DOI: 10.1117/12.3004580 (https://doi.org/10.1117/12.3004580)
Published in: 6th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE 2023)
Publication date: 2023-08-16
Citations: 0

Abstract

The goal of entity relation extraction is to extract entities and the relations between them from unstructured text. Most existing approaches target common entity labels in general-purpose domains (e.g., time, place, person, institution) and simple texts in specialized domains (single-sentence texts with low knowledge density). They ignore two phenomena in complex texts: long-distance dependencies between entity pairs (cross-sentence entity pairs) and overlapping relations (different relations sharing the same entity). Complex texts are nevertheless common in practice, especially in professional fields such as power technology standards, where knowledge density is high and cross-sentence entity-pair dependencies are frequent. To address these problems, this paper proposes a multi-hop automatic question-answering-based entity relation extraction method. It combines the well-established machine reading comprehension framework with the automatic question construction mechanism proposed in this paper, uses the prior knowledge carried by the question to guide the extraction type, and uses the multi-hop question-answering mechanism to reason about the answer span of each question, which alleviates overlapping relations and cross-sentence entity dependencies in complex texts. Extensive comparative experiments on the power technology standard dataset constructed for this paper show that the proposed MT-auQA model achieves the best performance among the compared methods.
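The abstract does not give the question templates or the reading comprehension model, so the following is only a minimal sketch of the general idea of question-driven extraction in two turns. Everything in it is an assumption made for illustration: the template dictionaries, the entity and relation names, the sample text, and the rule-based `toy_reader`, which stands in for a trained machine reading comprehension model. The first turn asks a type-specific question to find head entities; the second turn fills a relation template with each head and asks for the tail span.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical templates; the paper's own question construction is not shown in the abstract.
ENTITY_QUESTIONS: Dict[str, str] = {
    "Equipment": "Which equipment is mentioned in the text?",
}
RELATION_QUESTIONS: Dict[Tuple[str, str], str] = {
    ("Equipment", "rated_voltage"): "What is the rated voltage of {head}?",
    ("Equipment", "insulation_class"): "What is the insulation class of {head}?",
}

# A reader maps (question, context) to a list of answer spans.
Reader = Callable[[str, str], List[str]]


def toy_reader(question: str, context: str) -> List[str]:
    """Rule-based stand-in for an MRC model (assumption, not the paper's model).

    It keys only on the question wording and ignores the head entity
    that was filled into the template.
    """
    q = question.lower()
    if q.startswith("which equipment"):
        return sorted(set(re.findall(r"\b(transformer|breaker)\b", context.lower())))
    if "rated voltage" in q:
        return re.findall(r"rated voltage is (\d+ ?kV)", context)
    if "insulation class" in q:
        return re.findall(r"insulation class of (\w+)", context)
    return []


def extract_triples(context: str, reader: Reader) -> List[Tuple[str, str, str]]:
    """Two-turn extraction: find head entities, then ask one question per relation."""
    triples = []
    for entity_type, entity_question in ENTITY_QUESTIONS.items():
        for head in reader(entity_question, context):  # turn 1: head entities
            for (head_type, relation), template in RELATION_QUESTIONS.items():
                if head_type != entity_type:
                    continue
                # turn 2: the question carries the relation type and the head entity
                for tail in reader(template.format(head=head), context):
                    triples.append((head, relation, tail))
    return triples


CONTEXT = (
    "The transformer shall be installed indoors. "
    "Its rated voltage is 10 kV. "
    "The transformer has an insulation class of F."
)

if __name__ == "__main__":
    for triple in extract_triples(CONTEXT, toy_reader):
        print(triple)
```

In this toy text the head entity and its rated voltage sit in different sentences, and the same head takes part in two relations, which mirrors the cross-sentence and overlapping-relation cases the abstract describes. Because each relation gets its own question, one shared entity does not force a single label onto a span.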