End-to-End n-ary Relation Extraction for Combination Drug Therapies.

IEEE International Conference on Healthcare Informatics, vol. 2023, pp. 72-80. Pub Date: 2023-06-01. Epub Date: 2023-12-11. DOI: 10.1109/ichi57859.2023.00021. PMC full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10814995/pdf/

Yuhang Jiang, Ramakanth Kavuluru


Abstract

Combination drug therapies are treatment regimens that involve two or more drugs, most commonly administered to patients with cancer, HIV, malaria, or tuberculosis. Currently, over 350K articles in PubMed use the combination drug therapy MeSH heading, with at least 10K articles published per year over the past two decades. Extracting combination therapies from scientific literature is inherently an n-ary relation extraction problem. Unlike the general n-ary setting where n is fixed (e.g., drug-gene-mutation relations where n = 3), extracting combination therapies is a special setting where n ≥ 2 is dynamic and varies with each instance. Recently, Tiktinsky et al. (NAACL 2022) introduced a first-of-its-kind dataset, CombDrugExt, for extracting such therapies from literature. Here, we use a sequence-to-sequence style end-to-end extraction method to achieve an F1-score of 66.7% on the CombDrugExt test set for positive (or effective) combinations. This is an absolute ≈ 5% F1-score improvement even over the prior best relation classification score, which was obtained with drug entities already spotted (hence, not end-to-end). Thus, our effort introduces the first model for end-to-end extraction on this task, one that is already superior to the best prior non-end-to-end model. Our model extracts all drug entities and relations in a single pass and is well suited to dynamic n-ary extraction scenarios.
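To make the dynamic n-ary setting concrete, the following is a minimal sketch of how variable-size drug combinations could be flattened into a single target string for a sequence-to-sequence model, parsed back, and scored. The abstract does not specify the target format, so the delimiters (` ; `, ` => `, ` | `), the `POS` label name, and all function names here are illustrative assumptions. The scoring uses exact set matching only and may differ from the official CombDrugExt evaluation.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A relation is an unordered set of n >= 2 drug names plus a label.
# The label "POS" and the delimiters below are hypothetical, not the paper's.
Relation = Tuple[FrozenSet[str], str]


def linearize(relations: Iterable[Relation]) -> str:
    """Flatten relations of any arity into one target string."""
    ordered = sorted(relations, key=lambda r: (sorted(r[0]), r[1]))
    return " | ".join(
        " ; ".join(sorted(drugs)) + " => " + label for drugs, label in ordered
    )


def parse(sequence: str) -> Set[Relation]:
    """Recover relations from a generated string, skipping malformed chunks."""
    relations: Set[Relation] = set()
    for chunk in sequence.split("|"):
        if "=>" not in chunk:
            continue
        drug_part, label = chunk.rsplit("=>", 1)
        drugs = frozenset(
            d.strip().lower() for d in drug_part.split(";") if d.strip()
        )
        # A combination needs at least two drugs (n >= 2).
        if len(drugs) >= 2 and label.strip():
            relations.add((drugs, label.strip()))
    return relations


def f1_for_label(gold: Set[Relation], pred: Set[Relation], label: str) -> float:
    """Exact-match F1 restricted to one label (e.g. positive combinations)."""
    gold_l = {r for r in gold if r[1] == label}
    pred_l = {r for r in pred if r[1] == label}
    tp = len(gold_l & pred_l)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_l)
    recall = tp / len(gold_l)
    return 2 * precision * recall / (precision + recall)


gold = {
    (frozenset({"bleomycin", "cisplatin", "etoposide"}), "POS"),  # n = 3
    (frozenset({"isoniazid", "rifampicin"}), "POS"),              # n = 2
}
# Stand-in for a model's generated output.
generated = "bleomycin ; cisplatin ; etoposide => POS | isoniazid ; pyrazinamide => POS"
pred = parse(generated)
score = f1_for_label(gold, pred, "POS")
```

Because each relation is stored as a set of drugs, the same code handles two-drug and three-drug regimens without a fixed n, which is the property the abstract highlights for this task.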

