Zero-shot medical event prediction using a generative pretrained transformer on electronic health records.

IF 4.6, Zone 2 (Medicine), JCR Q1 (Computer Science, Information Systems)

Journal of the American Medical Informatics Association. Pub Date: 2025-10-08. DOI: 10.1093/jamia/ocaf160

Ekaterina Redekop, Zichen Wang, Rushikesh Kulkarni, Mara Pleasure, Aaron Chin, Hamid Reza Hassanzadeh, Brian L Hill, Melika Emami, William F Speier, Corey W Arnold


Citations: 0

Abstract

Objectives: Longitudinal data in electronic health records (EHRs) represent an individual's clinical history through a sequence of codified concepts, including diagnoses, procedures, medications, and laboratory tests. Generative pretrained transformers (GPT) can leverage these data to predict future events. While fine-tuning these models can enhance task-specific performance, it becomes costly when applied to many clinical prediction tasks. In contrast, a pretrained foundation model can be used in a zero-shot forecasting setting, offering a scalable alternative to fine-tuning separate models for each outcome.

Materials and methods: This study presents the first comprehensive analysis of zero-shot forecasting with GPT-based foundational models in EHRs, introducing a novel pipeline that formulates medical concept prediction as a generative modeling task. Unlike supervised approaches requiring extensive labeled data, our method enables the model to forecast the next medical event purely from pretraining knowledge. We evaluate performance across multiple time horizons and clinical categories, demonstrating the model's ability to capture latent temporal dependencies and complex patient trajectories without task supervision.
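The abstract does not say how a forecast is read out of the generative model. One common way to do this, shown here purely as an illustrative sketch and not as the authors' pipeline, is Monte Carlo rollout: sample many continuations of a patient's concept sequence and count how often a target concept appears within a fixed horizon. Everything named below (`toy_model`, `zero_shot_event_probability`, the concept codes, and measuring the horizon in generated tokens) is a hypothetical assumption; a real system would replace `toy_model` with the pretrained EHR GPT's next-token distribution.

```python
import random
from typing import Callable, Dict, Sequence

# A "model" here is any function mapping a concept history to a
# next-concept probability distribution (hypothetical interface).
NextProbs = Callable[[Sequence[str]], Dict[str, float]]


def sample_next(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one concept from a next-concept distribution."""
    concepts = list(probs)
    weights = [probs[c] for c in concepts]
    return rng.choices(concepts, weights=weights, k=1)[0]


def zero_shot_event_probability(
    model: NextProbs,
    history: Sequence[str],
    target: str,
    horizon: int,
    n_rollouts: int = 200,
    seed: int = 0,
) -> float:
    """Estimate P(target occurs within `horizon` generated concepts).

    No task-specific training is involved: the estimate comes only from
    sampling the generative model (assumed rollout scheme).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rollouts):
        seq = list(history)
        for _ in range(horizon):
            nxt = sample_next(model(seq), rng)
            seq.append(nxt)
            if nxt == target:
                hits += 1
                break  # the event occurred in this rollout
    return hits / n_rollouts


def toy_model(seq: Sequence[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a pretrained EHR GPT (made-up numbers)."""
    if seq and seq[-1] == "DX:cirrhosis":
        return {"DX:liver_cancer": 0.3, "LAB:afp": 0.5, "RX:diuretic": 0.2}
    return {"DX:cirrhosis": 0.1, "LAB:afp": 0.3, "RX:diuretic": 0.6}


risk = zero_shot_event_probability(
    toy_model, ["DX:hepatitis_c"], "DX:liver_cancer", horizon=10
)
```

Thresholding `risk` would turn the estimate into a binary prediction for a given condition. Under this assumed scheme, the same model serves every outcome because only `target` changes.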

Results: The model's performance in predicting the next medical concept was evaluated using precision and recall metrics, achieving an average top-1 precision of 0.614 and recall of 0.524. For 12 major diagnostic conditions, the model demonstrated strong zero-shot performance, achieving high true positive rates while maintaining low false positives.
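The abstract reports top-1 precision and recall without giving the exact definitions. A minimal sketch of one plausible reading follows, assuming each example pairs the model's concept scores with the set of concepts that actually occur next, and that the metrics are averaged over examples. The function names and the example codes are hypothetical, and the numbers are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def top_k_precision_recall(
    scores: Dict[str, float], true_concepts: Set[str], k: int = 1
) -> Tuple[float, float]:
    """Precision and recall of the k highest-scoring concepts."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for c in ranked if c in true_concepts)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(true_concepts) if true_concepts else 0.0
    return precision, recall


def mean_top_k_metrics(
    examples: List[Tuple[Dict[str, float], Set[str]]], k: int = 1
) -> Tuple[float, float]:
    """Average per-example precision and recall (assumed averaging scheme)."""
    if not examples:
        return 0.0, 0.0
    pairs = [top_k_precision_recall(s, t, k) for s, t in examples]
    mean_p = sum(p for p, _ in pairs) / len(pairs)
    mean_r = sum(r for _, r in pairs) / len(pairs)
    return mean_p, mean_r


# Two made-up examples: (model scores, concepts observed next).
examples = [
    ({"E11": 0.6, "I10": 0.3, "N18": 0.1}, {"E11", "I10"}),
    ({"A": 0.2, "B": 0.8}, {"A"}),
]
mean_precision, mean_recall = mean_top_k_metrics(examples, k=1)
```

With k = 1 the predicted set holds a single concept, so per-example precision is either 0 or 1, while recall is capped at one over the number of true concepts. Under this reading, that cap is one way recall can fall below precision.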

Discussion: We demonstrate the power of a foundational EHR GPT model in capturing diverse phenotypes and enabling robust, zero-shot forecasting of clinical outcomes. This capability highlights both its versatility across conditions like liver cancer and SLE, and its limitations in more ambiguous settings such as depression, while also revealing meaningful latent clinical structure.

Conclusion: This capability enhances the versatility of predictive healthcare models and reduces the need for task-specific training, enabling more scalable applications in clinical settings.

Source journal: Journal of the American Medical Informatics Association (Medicine, Computer Science: Interdisciplinary Applications)

CiteScore: 14.50

Self-citation rate: 7.80%

Publication volume: 230

Review time: 3-8 weeks

Journal description: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.