Machine learning based prediction of medication adherence in heart failure using large electronic health record cohort with linkages to pharmacy-fill and neighborhood-level data.

Impact factor 4.6; Zone 2 (Medicine); JCR Q1 (Computer Science, Information Systems)

Journal of the American Medical Informatics Association Pub Date : 2025-10-01 DOI:10.1093/jamia/ocaf162

Samrachana Adhikari, Tyrel Stokes, Xiyue Li, Yunan Zhao, Cassidy Fitchett, Nathalia Ladino, Steven Lawrence, Min Qian, Young S Cho, Carine Hamo, John A Dodson, Rumi Chunara, Ian M Kronish, Amrita Mukhopadhyay, Saul B Blecker



Abstract

Objective: While timely interventions can improve medication adherence, it is challenging to identify at the point of care which patients are at risk of nonadherence. We aim to develop and validate flexible machine learning (ML) models to predict a continuous measure of adherence to guideline-directed medication therapies (GDMTs) for heart failure (HF).

Materials and methods: We used a large electronic health record (EHR) cohort of 34,697 HF patients seen at NYU Langone Health with an active prescription for ≥1 GDMT between April 01, 2021 and October 31, 2022. The outcome was adherence to GDMT, measured as the proportion of days covered (PDC) at 6 months following a clinical encounter. Over 120 predictors included patient-, therapy-, healthcare-, and neighborhood-level factors guided by the World Health Organization's model of barriers to adherence. We compared the performance of several ML models and their ensemble (superlearner) against a traditional regression model (ordinary least squares, OLS) for predicting PDC, using mean absolute error (MAE) averaged across 10-fold cross-validation, percent increase in MAE relative to superlearner, and predictive difference across deciles of predicted PDC.
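As an illustration of the outcome measure, the following is a minimal sketch of how a proportion of days covered could be computed from pharmacy-fill records. The abstract does not give the study's exact PDC algorithm, so the function name `proportion_of_days_covered`, the `(fill_date, days_supply)` record layout, the 180-day window standing in for "6 months", and the choice to count overlapping supply only once are all assumptions made for this example.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, window_days=180):
    """Fraction of days in [start, start + window_days) covered by any fill.

    fills: iterable of (fill_date, days_supply) tuples (hypothetical layout).
    Days supplied by overlapping fills are counted once, and supply that
    falls outside the observation window is ignored.
    """
    end = start + timedelta(days=window_days)
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day < end:
                covered.add(day)
    return len(covered) / window_days


# Toy example: two 30-day fills that overlap by 10 days.
# Coverage runs from April 1 through May 20, which is 50 distinct days.
example_fills = [(date(2021, 4, 1), 30), (date(2021, 4, 21), 30)]
example_pdc = proportion_of_days_covered(example_fills, date(2021, 4, 1))
```

Here `example_pdc` equals 50/180, about 0.28. A continuous value like this, rather than a thresholded adherent/nonadherent label, is the kind of target the models in the study are described as predicting.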

Results: Superlearner, a flexible nonparametric prediction approach, demonstrated superior prediction performance. Superlearner and quantile random forest had the lowest MAE (mean [95% CI] = 18.9% [18.7%-19.1%] for both), followed by quantile neural network (19.5% [19.3%-19.7%]) and kernel support vector regression (19.8% [19.6%-20.0%]). Gradient boosted trees and OLS were the two worst-performing models, with 17% and 14% higher MAEs, respectively, relative to superlearner. Superlearner also demonstrated improved predictive difference.
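The two error metrics named in the results can be written out as a short sketch. The helper names `mean_absolute_error` and `percent_increase` are hypothetical, and the implied absolute MAEs for gradient boosted trees and OLS are back-of-envelope values derived here from the rounded figures above; the abstract does not report them directly.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between observed and predicted PDC."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def percent_increase(mae_model, mae_reference):
    """Percent by which mae_model exceeds mae_reference."""
    return 100.0 * (mae_model - mae_reference) / mae_reference


# Reported MAEs, in percentage points of PDC.
superlearner_mae = 18.9
reported_mae = {
    "quantile neural network": 19.5,
    "kernel support vector regression": 19.8,
}

# Relative increase over superlearner for the models with a reported MAE.
relative = {
    name: percent_increase(mae, superlearner_mae)
    for name, mae in reported_mae.items()
}

# Approximate absolute MAEs implied by the reported relative increases
# (derived from rounded inputs, so only indicative).
reported_increase = {"gradient boosted trees": 17.0, "OLS": 14.0}
implied_mae = {
    name: superlearner_mae * (1 + pct / 100.0)
    for name, pct in reported_increase.items()
}
```

Under these rounded inputs, the quantile neural network and kernel support vector regression come out roughly 3% and 5% above superlearner, while the 17% and 14% increases correspond to MAEs of roughly 22.1% and 21.5%.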

Conclusion: This development-phase study suggests the potential of linked EHR-pharmacy data and ML to identify HF patients who will benefit from medication adherence interventions.

Discussion: Fairness evaluation and external validation are needed prior to clinical integration.


Source journal

Journal of the American Medical Informatics Association (Medicine; Computer Science, Interdisciplinary Applications)

CiteScore: 14.50

Self-citation rate: 7.80%

Articles published: 230

Review time: 3-8 weeks

Journal description: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.