Benchmarking large language models for replication of guideline-based PGx recommendations.

Impact factor 2.9; Zone 3 (Medicine); JCR Q2 (Genetics & Heredity)
Mike Zack, Ioan Slobodchikov, Danil Stupichev, Alex Moore, David Sokolov, Igor Trifonov, Allan Gobbs
Pharmacogenomics Journal, 25(4), 23. Journal Article. Published 2025-07-26. DOI: 10.1038/s41397-025-00383-0 (https://doi.org/10.1038/s41397-025-00383-0)

Abstract

We evaluated the ability of large language models (LLMs) to generate clinically accurate pharmacogenomic (PGx) recommendations aligned with CPIC guidelines. Using a benchmark of 599 curated gene-drug-phenotype scenarios, we compared five leading models, including GPT-4o and fine-tuned LLaMA variants, through both standard lexical metrics and a novel semantic evaluation framework (LLM Score) validated by expert review. General-purpose models frequently produced incomplete or unsafe outputs, while our domain-adapted model achieved superior performance, with an LLM Score of 0.92 and significantly faster inference speed. Results highlight the importance of fine-tuning and structured prompting over model scale alone. This work establishes a robust framework for evaluating PGx-specific LLMs and demonstrates the feasibility of safer, AI-driven personalized medicine.

Source journal
Pharmacogenomics Journal (Medicine: Pharmacy)
CiteScore: 7.20
Self-citation rate: 0.00%
Articles published: 35
Review time: 6-12 weeks
About the journal: The Pharmacogenomics Journal is a print and electronic journal dedicated to the rapid publication of original research on pharmacogenomics and its clinical applications. Key areas of coverage include: personalized medicine; effects of genetic variability on drug toxicity and efficacy; identification and functional characterization of polymorphisms relevant to drug action; pharmacodynamic and pharmacokinetic variations and drug efficacy; integration of new developments in the genome project and proteomics into clinical medicine, pharmacology, and therapeutics; clinical applications of genomic science; identification of novel genomic targets for drug development; and potential benefits of pharmacogenomics.