Towards discovery: an end-to-end system for uncovering novel biomedical relations.

IF 4.3 3区 材料科学 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Tiago Almeida, Richard A A Jonker, Rui Antunes, João R Almeida, Sérgio Matos
{"title":"Towards discovery: an end-to-end system for uncovering novel biomedical relations.","authors":"Tiago Almeida, Richard A A Jonker, Rui Antunes, João R Almeida, Sérgio Matos","doi":"10.1093/database/baae057","DOIUrl":null,"url":null,"abstract":"<p><p>Biomedical relation extraction is an ongoing challenge within the natural language processing community. Its application is important for understanding scientific biomedical literature, with many use cases, such as drug discovery, precision medicine, disease diagnosis, treatment optimization and biomedical knowledge graph construction. Therefore, the development of a tool capable of effectively addressing this task holds the potential to improve knowledge discovery by automating the extraction of relations from research manuscripts. The first track in the BioCreative VIII competition extended the scope of this challenge by introducing the detection of novel relations within the literature. This paper describes that our participation system initially focused on jointly extracting and classifying novel relations between biomedical entities. We then describe our subsequent advancement to an end-to-end model. Specifically, we enhanced our initial system by incorporating it into a cascading pipeline that includes a tagger and linker module. This integration enables the comprehensive extraction of relations and classification of their novelty directly from raw text. Our experiments yielded promising results, and our tagger module managed to attain state-of-the-art named entity recognition performance, with a micro F1-score of 90.24, while our end-to-end system achieved a competitive novelty F1-score of 24.59. The code to run our system is publicly available at https://github.com/ieeta-pt/BioNExt. Database URL: https://github.com/ieeta-pt/BioNExt.</p>","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11240158/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/database/baae057","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Biomedical relation extraction is an ongoing challenge within the natural language processing community. Its application is important for understanding scientific biomedical literature, with many use cases, such as drug discovery, precision medicine, disease diagnosis, treatment optimization and biomedical knowledge graph construction. Therefore, the development of a tool capable of effectively addressing this task holds the potential to improve knowledge discovery by automating the extraction of relations from research manuscripts. The first track in the BioCreative VIII competition extended the scope of this challenge by introducing the detection of novel relations within the literature. This paper describes that our participation system initially focused on jointly extracting and classifying novel relations between biomedical entities. We then describe our subsequent advancement to an end-to-end model. Specifically, we enhanced our initial system by incorporating it into a cascading pipeline that includes a tagger and linker module. This integration enables the comprehensive extraction of relations and classification of their novelty directly from raw text. Our experiments yielded promising results, and our tagger module managed to attain state-of-the-art named entity recognition performance, with a micro F1-score of 90.24, while our end-to-end system achieved a competitive novelty F1-score of 24.59. The code to run our system is publicly available at https://github.com/ieeta-pt/BioNExt. Database URL: https://github.com/ieeta-pt/BioNExt.

走向发现:发现新型生物医学关系的端到端系统。
生物医学关系提取是自然语言处理界的一项持续挑战。它的应用对于理解生物医学科学文献非常重要,有很多用例,如药物发现、精准医疗、疾病诊断、治疗优化和生物医学知识图谱构建。因此,开发一种能够有效解决这一任务的工具,有望通过自动提取研究手稿中的关系来改进知识发现。BioCreative VIII 竞赛的第一个赛道扩展了这一挑战的范围,引入了文献中新型关系的检测。本文介绍了我们的参赛系统最初侧重于联合提取生物医学实体之间的新型关系并对其进行分类。然后,我们介绍了我们随后向端到端模型的发展。具体来说,我们将最初的系统纳入了一个级联管道,其中包括一个标记和链接模块,从而增强了系统的功能。通过这种整合,我们可以直接从原始文本中全面提取关系并对其新颖性进行分类。我们的实验取得了可喜的成果,我们的标记模块达到了最先进的命名实体识别性能,微观 F1 分数为 90.24,而我们的端到端系统的新颖性 F1 分数为 24.59。运行我们系统的代码可通过 https://github.com/ieeta-pt/BioNExt 公开获取。数据库网址:https://github.com/ieeta-pt/BioNExt。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
7.20
自引率
4.30%
发文量
567
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信