BAMRE: Joint extraction model of Chinese medical entities and relations based on Biaffine transformation with relation attention

Impact Factor 4.5; Zone 2, Medicine; JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Biomedical Informatics; Pub Date: 2024-10-01; DOI: 10.1016/j.jbi.2024.104733

Jiaqi Sun, Chen Zhang, Linlin Xing, Longbo Zhang, Hongzhen Cai, Maozu Guo

Journal of Biomedical Informatics, Volume 158, Article 104733 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S1532046424001515

Citations: 0

Abstract

Electronic Health Records (EHRs) contain many valuable medical entities and the relationships among them. Although biomedical relation extraction has achieved good results in mining EHRs and constructing biomedical knowledge bases, problems remain. Overlapping triplets may carry implicit, complex associations between entities and relations, and ignoring these interactions can reduce the accuracy of entity extraction. To address this issue, a joint extraction model for medical entities and relations based on a relation attention mechanism is proposed. The relation extraction module identifies candidate relations within a sentence. An attention mechanism conditioned on these relations then assigns weights to the contextual words associated with each relation and extracts the subject and object entities. Under a specific relation, entity vector representations are used to construct a global entity matching matrix based on Biaffine transformations. This matrix is designed to enhance the semantic dependencies and relational representations between entities, enabling triplet extraction. In this way the two subtasks of named entity recognition and relation extraction are interrelated, contextual information within the sentence is fully used, and the problem of overlapping triplets is effectively addressed.
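The two mechanisms named in the abstract, relation-conditioned attention and Biaffine entity matching, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no equations, so the dot-product attention, the scoring form s = h_s^T U h_o + w^T [h_s; h_o] + b, and every name below (relation_attention, biaffine_score, matching_matrix, the toy vectors) are assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_attention(tokens, rel):
    """Weight each token vector by its similarity to one relation embedding.

    Assumption: plain dot-product attention; the paper may use another form.
    """
    weights = softmax([dot(t, rel) for t in tokens])
    attended = [[w * x for x in t] for w, t in zip(weights, tokens)]
    return weights, attended


def biaffine_score(h_s, h_o, U, w, b):
    """Biaffine score: h_s^T U h_o + w^T [h_s; h_o] + b."""
    bilinear = sum(h_s[i] * U[i][j] * h_o[j]
                   for i in range(len(h_s)) for j in range(len(h_o)))
    linear = dot(w, h_s + h_o)  # list concatenation stands in for [h_s; h_o]
    return bilinear + linear + b


def matching_matrix(tokens, U, w, b):
    """Score every (subject position, object position) pair for one relation."""
    return [[biaffine_score(h_s, h_o, U, w, b) for h_o in tokens]
            for h_s in tokens]


# Toy input: three 2-dimensional token vectors and one relation embedding.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
rel = [1.0, 0.0]
U = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical parameters (identity matrix)
w = [0.0, 0.0, 0.0, 0.0]
b = 0.0

weights, attended = relation_attention(tokens, rel)
M = matching_matrix(attended, U, w, b)  # M[i][j]: subject i paired with object j
```

In a trained model, U, w and b would be learned per relation, and high-scoring cells of the matrix would be decoded into (subject, relation, object) triplets. Because each relation gets its own matrix, the same entity can appear in several triplets, which is how this family of methods accommodates overlapping triplets.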

Experimental results on the CMeIE Chinese medical relation extraction dataset and the Baidu2019 Chinese dataset confirm that our approach achieves a higher F1 score than all state-of-the-art baselines. Moreover, it offers substantial performance improvements in complex scenarios involving diverse overlapping patterns, large numbers of triplets, and cross-sentence triplets.
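For readers unfamiliar with how an F1 score is computed for triplet extraction, the following is a minimal sketch. It assumes exact-match scoring, where a predicted (subject, relation, object) triplet counts as correct only if all three parts match a gold triplet. The abstract does not state the evaluation protocol, so this criterion, the function name triplet_prf, and the example triplets are illustrative assumptions, not data from the paper.

```python
def triplet_prf(gold, pred):
    """Precision, recall and F1 over sets of (subject, relation, object) triplets."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # exact matches on all three fields
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical triplets. "diabetes" is the subject of two gold triplets,
# which is an example of an overlapping pattern.
gold = [
    ("diabetes", "drug_therapy", "metformin"),
    ("diabetes", "clinical_manifestation", "polyuria"),
    ("pneumonia", "imaging_examination", "chest X-ray"),
]
pred = [
    ("diabetes", "drug_therapy", "metformin"),
    ("diabetes", "clinical_manifestation", "polyuria"),
    ("pneumonia", "drug_therapy", "chest X-ray"),  # wrong relation, not counted
]

precision, recall, f1 = triplet_prf(gold, pred)
```

With two of three predictions correct and two of three gold triplets recovered, precision, recall and F1 all equal 2/3 in this toy case.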




Source journal

Journal of Biomedical Informatics (Medicine; Computer Science: Interdisciplinary Applications)

CiteScore: 8.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 32 days

About the journal: The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.