Samrachana Adhikari, Tyrel Stokes, Xiyue Li, Yunan Zhao, Cassidy Fitchett, Nathalia Ladino, Steven Lawrence, Min Qian, Young S Cho, Carine Hamo, John A Dodson, Rumi Chunara, Ian M Kronish, Amrita Mukhopadhyay, Saul B Blecker
Journal of the American Medical Informatics Association. Published 2025-10-01. DOI: 10.1093/jamia/ocaf162 (https://doi.org/10.1093/jamia/ocaf162)
Machine learning based prediction of medication adherence in heart failure using large electronic health record cohort with linkages to pharmacy-fill and neighborhood-level data.
Objective: While timely interventions can improve medication adherence, it is challenging to identify which patients are at risk of nonadherence at the point of care. We aim to develop and validate flexible machine learning (ML) models to predict a continuous measure of adherence to guideline-directed medication therapies (GDMTs) for heart failure (HF).
Materials and methods: We used a large electronic health record (EHR) cohort of 34,697 HF patients seen at NYU Langone Health with an active prescription for ≥1 GDMT between April 1, 2021 and October 31, 2022. The outcome was adherence to GDMT, measured as the proportion of days covered (PDC) at 6 months following a clinical encounter. More than 120 predictors included patient-, therapy-, healthcare-, and neighborhood-level factors, guided by the World Health Organization's model of barriers to adherence. We compared the performance of several ML models and their ensemble (superlearner) for predicting PDC against a traditional regression model (ordinary least squares, OLS) using three criteria: mean absolute error (MAE) averaged across 10-fold cross-validation, the percentage increase in MAE relative to superlearner, and the predictive difference across deciles of predicted PDC.
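As an illustration of the outcome measure, the following is a minimal sketch of how a PDC value could be computed from pharmacy-fill records. It is not the authors' implementation: the function name, the `(fill_date, days_supply)` record layout, the 180-day window standing in for "6 months", and the example fills are all assumptions made for this sketch. It also ignores refinements such as carrying forward surplus supply from early refills or combining several GDMT classes.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, window_days=180):
    """Return the fraction of days in [start, start + window_days) with supply on hand.

    fills: iterable of (fill_date, days_supply) tuples (hypothetical layout).
    window_days: assumed length of the 6-month observation window.
    """
    end = start + timedelta(days=window_days)
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            # Count each calendar day at most once, and only inside the window.
            if start <= day < end:
                covered.add(day)
    return len(covered) / window_days


# Hypothetical patient: three fills after an encounter on April 1, 2021.
encounter = date(2021, 4, 1)
fills = [
    (date(2021, 4, 1), 30),
    (date(2021, 5, 15), 30),
    (date(2021, 7, 1), 90),
]
pdc = proportion_of_days_covered(fills, encounter)
print(f"PDC = {pdc:.3f}")
```

Using a set of covered days means overlapping fills are not double-counted, so the result always lies between 0 and 1.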
Results: Superlearner, a flexible nonparametric prediction approach, demonstrated superior prediction performance. Superlearner and quantile random forest had the lowest MAE (mean [95% CI] = 18.9% [18.7%-19.1%] for both), followed by quantile neural network (19.5% [19.3%-19.7%]) and kernel support vector regression (19.8% [19.6%-20.0%]). Gradient boosted trees and OLS were the two worst-performing models, with MAEs 17% and 14% higher, respectively, than superlearner. Superlearner also demonstrated improved predictive difference.
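The "percentage increase in MAE relative to superlearner" criterion is simple arithmetic. The sketch below applies it to the rounded MAE values quoted in the abstract; the function and variable names are assumptions for illustration, and because the inputs are rounded, the computed percentages are approximate rather than figures reported by the authors.

```python
def pct_increase(mae_model, mae_reference):
    """Percentage by which mae_model exceeds mae_reference."""
    return 100.0 * (mae_model - mae_reference) / mae_reference


# Mean MAE values (in percentage points of PDC) as quoted in the abstract.
reported_mae = {
    "superlearner": 18.9,
    "quantile random forest": 18.9,
    "quantile neural network": 19.5,
    "kernel support vector regression": 19.8,
}

reference = reported_mae["superlearner"]
relative = {
    name: pct_increase(mae, reference) for name, mae in reported_mae.items()
}

for name, value in relative.items():
    print(f"{name}: {value:.1f}% higher MAE than superlearner")
```

On these rounded inputs, the quantile neural network and kernel support vector regression come out roughly 3% and 5% above superlearner, well below the 17% and 14% gaps reported for gradient boosted trees and OLS.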
Conclusion: This development-phase study suggests the potential of linked EHR-pharmacy data and ML to identify HF patients who will benefit from medication adherence interventions.
Discussion: Fairness evaluation and external validation are needed prior to clinical integration.
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.