Unlocking Potential of Generative Large Language Models for Adverse Drug Reaction Relation Prediction in Discharge Summaries: Analysis and Strategy.

IF 5.5 · CAS Tier 2 (Medicine) · JCR Q1, PHARMACOLOGY & PHARMACY
Yen Ling Koon, Hui Xing Tan, Desmond Chun Hwee Teo, Jing Wei Neo, Pei San Ang, Celine Wei Ping Loke, Mun Yee Tham, Siew Har Tan, Bee Leng Sally Soh, Pei Qin Belinda Foo, Sreemanee Raaj Dorajoo
Clinical Pharmacology & Therapeutics · DOI: 10.1002/cpt.70100 · Published: 2025-10-20 · Journal Article
Citations: 0

Abstract


We present a comparative analysis of generative large language models (LLMs) for predicting causal relationships between drugs and adverse events found in text segments from discharge summaries. Despite lacking prior training for identifying related drug-adverse event pairs, generative LLMs demonstrate exceptional performance as recall-optimized models, achieving F1 scores comparable to those of fine-tuned models. Notably, on the MIMIC-Unrestricted dataset, Gemini 1.5 Pro and Llama 3.1 405B outperform our in-house fine-tuned BioM-ELECTRA-Large, with Gemini 1.5 Pro showing a 19.2% (0.724-0.863) improvement in F1 score and a 39.7% (0.675-0.943) increase in recall, while Llama 3.1 405B exhibits a 12.4% (0.724-0.814) improvement in F1 and a 40.4% (0.675-0.948) boost in recall. Additionally, we propose a hybrid approach that integrates BioM-ELECTRA-Large with generative LLMs, resulting in enhanced performance over the individual models. Our hybrid model achieves F1 score improvements ranging from 0.8% to 18.5% (0.005-0.133) over BioM-ELECTRA-Large in the validation set, primarily due to increased precision, albeit with a decrease in recall compared with the original generative LLM. Importantly, this approach yields substantial computational resource savings, as BioM-ELECTRA-Large selects only a subset of segments-ranging from 19.7% to 73.4% across our datasets-for downstream prediction by generative LLMs. By harnessing the strengths of generative LLMs as recall-optimized models and combining them with fine-tuned models, we can unlock the full potential of artificial intelligence in predicting adverse drug reaction relations, ultimately enhancing patient safety.
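The hybrid cascade described in the abstract — a fine-tuned encoder screening segments so that the generative LLM only runs on the retained subset — can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: `encoder_score` and `llm_predict` are hypothetical stand-ins for BioM-ELECTRA-Large and a generative LLM, and the toy scoring rules exist only to make the example runnable. The `rel_improvement` helper reproduces the percentage arithmetic quoted above (e.g. (0.863 − 0.724)/0.724 ≈ 19.2%).

```python
# Illustrative sketch of the hybrid cascade: a fine-tuned encoder filters
# discharge-summary segments, and only the flagged subset is forwarded to
# a generative LLM for the final drug-adverse-event relation prediction.
# `encoder_score` and `llm_predict` are hypothetical stand-ins.

def rel_improvement(base: float, new: float) -> float:
    """Relative improvement, as used for the percentages in the abstract."""
    return (new - base) / base

def encoder_score(segment: str) -> float:
    """Stand-in for a fine-tuned classifier (e.g. BioM-ELECTRA-Large)
    scoring how likely the segment holds a related drug-ADE pair."""
    return 0.9 if "rash" in segment else 0.1

def llm_predict(segment: str) -> bool:
    """Stand-in for a generative LLM's yes/no relation judgment."""
    return "after starting" in segment

def hybrid_predict(segments, threshold=0.5):
    """Cascade: encoder filters; the LLM sees only the retained subset."""
    preds, forwarded = [], 0
    for seg in segments:
        if encoder_score(seg) >= threshold:
            forwarded += 1
            preds.append(llm_predict(seg))   # expensive LLM call, subset only
        else:
            preds.append(False)              # filtered out: assume unrelated
    return preds, forwarded / len(segments)

segments = [
    "Patient developed a rash after starting amoxicillin.",
    "Discharged home in stable condition.",
]
preds, fraction_forwarded = hybrid_predict(segments)

# Sanity check of the quoted Gemini 1.5 Pro F1 gain on MIMIC-Unrestricted:
gemini_f1_gain = rel_improvement(0.724, 0.863)  # ~0.192, i.e. 19.2%
```

Because the encoder forwards only a fraction of segments (19.7% to 73.4% across the paper's datasets), the costly generative-LLM calls are proportionally reduced, which is the source of the computational savings the abstract reports.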

Source journal
CiteScore: 12.70
Self-citation rate: 7.50%
Annual publications: 290
Review time: 2 months
Journal description: Clinical Pharmacology & Therapeutics (CPT) is the authoritative cross-disciplinary journal in experimental and clinical medicine devoted to publishing advances in the nature, action, efficacy, and evaluation of therapeutics. CPT welcomes original Articles in the emerging areas of translational, predictive and personalized medicine; new therapeutic modalities including gene and cell therapies; pharmacogenomics, proteomics and metabolomics; bioinformation and applied systems biology complementing areas of pharmacokinetics and pharmacodynamics, human investigation and clinical trials, pharmacovigilance, pharmacoepidemiology, pharmacometrics, and population pharmacology.