The joint extraction of fact-condition statement and super relation in scientific text with table filling method

Impact factor: 7.4 · Zone 1 (Management) · JCR Q1 (Computer Science, Information Systems)
Qizhi Chen, Hong Yao, Diange Zhou
Qizhi Chen, Hong Yao, Diange Zhou. "The joint extraction of fact-condition statement and super relation in scientific text with table filling method." Information Processing & Management, 62(1), Article 103906. Published 2024-10-01. DOI: 10.1016/j.ipm.2024.103906
Full text: https://www.sciencedirect.com/science/article/pii/S0306457324002656
Citations: 0

Abstract

Fact-condition statements are of great significance in scientific text: they record a natural phenomenon together with its preconditions in detail. In previous studies, the extraction of fact-condition statements and the relations between them (super relations) from scientific text was designed as a pipeline in which the statements and the super relations are extracted one after the other, which causes error propagation and lowers accuracy. To solve this problem, the table filling method is adopted, for the first time, for the joint extraction of fact-condition statements and super relations, and the Biaffine Convolution Neural Network (BCNN) model is proposed to carry out the task. In the BCNN, the pretrained language model and the Biaffine Neural Network serve as the encoder, while a Convolution Neural Network is added as the decoder to enhance local semantic information. Benefiting from this local semantic enhancement, the BCNN achieves the best F1 score among the baselines with each of the pretrained language models tested. On GeothCF (geological text), its F1 scores reach 73.17% and 71.04% with BERT and SciBERT as the pretrained language model, respectively. The local semantic enhancement also improves training efficiency, because it makes the distribution of tags easier for the model to learn. In addition, the BCNN trained on GeothCF also gives the best performance on BioCF (biomedical text), which indicates that it can be widely applied to information extraction across scientific domains. Finally, a geological fact-condition knowledge graph is built with the BCNN, demonstrating a new pipeline for constructing scientific fact-condition knowledge graphs.
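The abstract does not give the model's equations, so the following is only a rough sketch of the two ideas it names: scoring every token pair into an n×n table with a biaffine form, then passing a convolution over that table so each cell's score draws on its neighbouring cells. All names and shapes here (`biaffine_table`, `conv3x3`, `H`, `U`, `W`, `b`, a single score channel) are illustrative assumptions, not the authors' BCNN implementation, which would use learned parameters on top of BERT or SciBERT representations and predict a tag per cell.

```python
def biaffine_table(H, U, W, b):
    """Score every token pair (i, j) as h_i^T U h_j + W . [h_i; h_j] + b.

    H: n token vectors of size d (assumed to come from an encoder).
    U: d x d matrix, W: vector of size 2d, b: scalar (all hypothetical).
    """
    n, d = len(H), len(H[0])
    table = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # h_i^T U is reused for every column j of row i.
        hU = [sum(H[i][p] * U[p][q] for p in range(d)) for q in range(d)]
        for j in range(n):
            bilinear = sum(hU[q] * H[j][q] for q in range(d))
            linear = sum(w * x for w, x in zip(W, H[i] + H[j]))
            table[i][j] = bilinear + linear + b
    return table


def conv3x3(table, kernel):
    """Zero-padded 3x3 cross-correlation (the 'convolution' of CNN layers).

    Each output cell mixes the scores of its up-to-eight neighbours,
    which is the sense of 'local' enhancement assumed in this sketch.
    """
    n = len(table)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < n and 0 <= c < n:
                        acc += kernel[di + 1][dj + 1] * table[r][c]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    # Toy 3-token sentence with 2-dimensional vectors (made-up values).
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    U = [[1.0, 0.0], [0.0, 1.0]]
    W = [0.0, 0.0, 0.0, 0.0]
    scores = biaffine_table(H, U, W, 0.0)
    smooth = [[1.0 / 9.0] * 3 for _ in range(3)]  # simple averaging kernel
    for row in conv3x3(scores, smooth):
        print(row)
```

In a joint-extraction setting, a single table of this kind lets statement spans and the super relations between them be read from the same grid, rather than from two models run in sequence; how the tags are laid out in the table is specific to the paper and is not reproduced here.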
Source journal
Information Processing & Management (Engineering & Technology, Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.