End-to-End n-ary Relation Extraction for Combination Drug Therapies
Yuhang Jiang, Ramakanth Kavuluru
IEEE International Conference on Healthcare Informatics, 2023-06-01 (Epub 2023-12-11)
DOI: https://doi.org/10.1109/ichi57859.2023.00021
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10814995/pdf/
Abstract
Combination drug therapies are treatment regimens that involve two or more drugs and are most commonly administered to patients with cancer, HIV, malaria, or tuberculosis. PubMed currently indexes over 350K articles under the combination drug therapy MeSH heading, with at least 10K articles published per year over the past two decades. Extracting combination therapies from scientific literature is inherently an n-ary relation extraction problem. Unlike the general n-ary setting, where n is fixed (e.g., drug-gene-mutation relations, where n = 3), combination therapy extraction is a special setting in which n ≥ 2 is dynamic and varies with each instance. Recently, Tiktinsky et al. (NAACL 2022) introduced a first-of-its-kind dataset, CombDrugExt, for extracting such therapies from the literature. Here, we use a sequence-to-sequence style end-to-end extraction method to achieve an F1 score of 66.7% on the CombDrugExt test set for positive (or effective) combinations. This is an absolute improvement of ≈ 5% in F1 score even over the prior best relation classification score, which was obtained with drug entities already spotted (hence, not end-to-end). Our effort thus introduces the first model for end-to-end extraction on this task, one that sets the state of the art and already outperforms the best prior non-end-to-end model. Our model extracts all drug entities and relations in a single pass and is well suited to dynamic n-ary extraction scenarios.
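To make the dynamic-arity idea concrete, the following is a minimal sketch of how relations over a varying number of drugs could be serialized into a single target string for a sequence-to-sequence model and parsed back for scoring. The output format (`LABEL: drug | drug ; ...`), the `POS` label name, the helper names, and the exact-match F1 are all assumptions chosen for illustration; the abstract does not specify the authors' actual linearization or evaluation protocol.

```python
from typing import FrozenSet, List, Set, Tuple

# A relation is a label plus an unordered set of n >= 2 drug names.
# This representation is an illustrative assumption, not the paper's schema.
Relation = Tuple[str, FrozenSet[str]]


def normalize(relations: List[Relation]) -> Set[Relation]:
    """Lower-case drug names so comparison is case-insensitive."""
    return {(label, frozenset(d.lower() for d in drugs)) for label, drugs in relations}


def linearize(relations: List[Relation]) -> str:
    """Serialize relations of any arity into one target string."""
    parts = [
        f"{label}: " + " | ".join(sorted(drugs))
        for label, drugs in normalize(relations)
    ]
    return " ; ".join(sorted(parts))


def parse(text: str) -> Set[Relation]:
    """Parse a generated string back into relations, skipping malformed chunks."""
    relations: Set[Relation] = set()
    for chunk in text.split(";"):
        label, sep, body = chunk.partition(":")
        drugs = frozenset(d.strip().lower() for d in body.split("|") if d.strip())
        # A combination needs a label and at least two drugs (n >= 2).
        if sep and label.strip() and len(drugs) >= 2:
            relations.add((label.strip(), drugs))
    return relations


def exact_match_f1(pred: Set[Relation], gold: Set[Relation]) -> float:
    """F1 over relations that match the gold label and drug set exactly."""
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one binary and one ternary combination in the same text.
gold_relations = [
    ("POS", frozenset({"cisplatin", "paclitaxel"})),
    ("POS", frozenset({"lopinavir", "ritonavir", "zidovudine"})),
]
target = linearize(gold_relations)
```

Because each relation is written as a delimited list rather than a fixed-width tuple, the same output format covers n = 2, n = 3, or larger combinations without changing the model's output space, which is what makes a generative formulation a natural fit for the dynamic n-ary setting.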