Effectiveness of Transformer-Based Large Language Models in Identifying Adverse Drug Reaction Relations from Unstructured Discharge Summaries in Singapore.

IF 3.8 | Zone 2 (Medicine) | Q1 PHARMACOLOGY & PHARMACY

Drug Safety Pub Date : 2025-06-01 Epub Date: 2025-02-21 DOI:10.1007/s40264-025-01525-w

Yen Ling Koon, Yan Tung Lam, Hui Xing Tan, Desmond Hwee Chun Teo, Jing Wei Neo, Aaron Jun Yi Yap, Pei San Ang, Celine Ping Wei Loke, Mun Yee Tham, Siew Har Tan, Sally Leng Bee Soh, Belinda Qin Pei Foo, Zheng Jye Ling, James Luen Wei Yip, Sreemanee Raaj Dorajoo

Drug Safety, pp. 667-677. Publication type: Journal Article.

Citations: 0

Abstract

Introduction: Transformer-based large language models (LLMs) have transformed the field of natural language processing and led to significant advancements in various text processing tasks. However, the applicability of these LLMs in identifying related drug-adverse event (AE) pairs within clinical context may be limited by the prevalent use of non-standard sentence structures and grammar.

Method: Nine transformer-based LLMs pre-trained on biomedical domain corpora are fine-tuned on annotated data (n = 5088) to classify drug-AE pairs in unstructured discharge summaries as causally related or unrelated. These LLMs are then validated on text segments from deidentified hospital discharge summaries from Singapore (n = 1647). To assess generalisability, the models are validated on annotated segments (n = 4418) from the Medical Information Mart for Intensive Care (MIMIC-III) database. Performance of LLMs in identifying related drug-AE pairs is then compared against a prior benchmark set by traditional machine learning models on the same data.
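
As a rough illustration of the classification setup described in the Method, the sketch below enumerates candidate drug-AE pairs in a text segment and wraps each pair in marker tokens so that a sequence classifier could label it as related or unrelated. The marker strings, the `Mention` type, and the function names are hypothetical; the abstract does not state how the authors encoded their pairs.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Mention:
    """A hypothetical annotated span: character offsets into the segment."""
    start: int
    end: int
    kind: str  # "drug" or "ae"


def mark_pair(text: str, drug: Mention, ae: Mention) -> str:
    """Wrap one drug mention and one AE mention in illustrative marker tokens.

    Assumes the two mentions do not overlap.
    """
    spans = sorted(
        [(drug, "[DRUG]", "[/DRUG]"), (ae, "[AE]", "[/AE]")],
        key=lambda item: item[0].start,
        reverse=True,  # insert from the right so earlier offsets stay valid
    )
    for mention, open_tok, close_tok in spans:
        text = (
            text[: mention.start]
            + f"{open_tok} {text[mention.start:mention.end]} {close_tok}"
            + text[mention.end:]
        )
    return text


def candidate_pairs(text: str, mentions: list[Mention]) -> list[str]:
    """Return one marked copy of the segment per (drug, AE) combination."""
    drugs = [m for m in mentions if m.kind == "drug"]
    aes = [m for m in mentions if m.kind == "ae"]
    return [mark_pair(text, d, a) for d, a in product(drugs, aes)]
```

Each returned string would then be one classification instance; a segment with two drugs and three AEs would yield six instances.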

Results: Using an LLM-Bidirectional long short-term memory (LLM-BiLSTM) architecture, transformer-based LLMs improve the F1 score compared with the prior benchmark, with BioM-ELECTRA-Large-BiLSTM showing an average F1 score improvement of 16.1% (an increase from 0.64 to 0.74). Applying additional rules to the LLM-based predictions, such as ignoring drug-AE pairs when the AE is a known indication of the drug, further reduces false positive rates, with precision increases of up to 5.6% (a 0.04 increment).
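
The rule mentioned in the Results (discarding a drug-AE pair when the AE is a known indication of the drug) can be sketched as a post-processing step on model predictions. This is a minimal sketch, not the authors' implementation: the `KNOWN_INDICATIONS` lookup, the function names, and the toy data are all made up for illustration, and the abstract does not name the knowledge source behind the rule.

```python
# Hypothetical lookup of known indications per drug; a real system would draw
# this from a curated drug knowledge source, which the abstract does not name.
KNOWN_INDICATIONS = {
    "metformin": {"diabetes mellitus"},
    "paracetamol": {"fever", "pain"},
}


def apply_indication_rule(pairs, predictions):
    """Override a 'related' prediction (1) to 'unrelated' (0) when the AE term
    is a known indication of the drug."""
    filtered = []
    for (drug, ae), pred in zip(pairs, predictions):
        is_indication = ae.lower() in KNOWN_INDICATIONS.get(drug.lower(), set())
        filtered.append(0 if is_indication else pred)
    return filtered


def precision(y_true, y_pred):
    """Precision = TP / (TP + FP), with 1 meaning 'causally related'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy data, invented for illustration only.
pairs = [
    ("paracetamol", "fever"),             # indication, not an adverse reaction
    ("paracetamol", "rash"),
    ("metformin", "lactic acidosis"),
    ("metformin", "diabetes mellitus"),   # indication, not an adverse reaction
]
y_true = [0, 1, 1, 0]
y_pred = [1, 1, 1, 1]  # a model that over-predicts 'related'

y_rule = apply_indication_rule(pairs, y_pred)
print(precision(y_true, y_pred), precision(y_true, y_rule))
```

On this toy input the rule removes both false positives, so precision rises from 0.5 to 1.0. Because the rule only ever flips predictions from related to unrelated, it can raise precision but cannot raise recall.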

Conclusion: Transformer-based LLMs outperform traditional machine learning methods in identifying causally related drug-AE pairs embedded within unstructured discharge summaries. Nonetheless, the improvement in performance with rules indicates that LLMs remain imperfect for this causal relation detection task.


Source journal

Drug Safety (Medicine: Toxicology)

CiteScore: 7.60

Self-citation rate: 7.10%

Articles published: 112

Review time: 6-12 weeks

Journal description: Drug Safety is the official journal of the International Society of Pharmacovigilance. The journal includes: Overviews of contentious or emerging issues. Comprehensive narrative reviews that provide an authoritative source of information on epidemiology, clinical features, prevention and management of adverse effects of individual drugs and drug classes. In-depth benefit-risk assessment of adverse effect and efficacy data for a drug in a defined therapeutic area. Systematic reviews (with or without meta-analyses) that collate empirical evidence to answer a specific research question, using explicit, systematic methods as outlined by the PRISMA statement. Original research articles reporting the results of well-designed studies in disciplines such as pharmacoepidemiology, pharmacovigilance, pharmacology and toxicology, and pharmacogenomics. Editorials and commentaries on topical issues. Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal’s content. In addition, articles published in Drug Safety may be accompanied by plain language summaries to assist readers who have some knowledge of, but not in-depth expertise in, the area to understand important medical advances.