Improving classification of myocardial infarction with machine learning in a diverse population
Alicia Chen, Chuan Hong, Yuk Lam Ho, Nicholas Link, Jacqueline P Honerlaw, Vidisha Tanukonda, Ariela R Orkaby, Saadia Qazi, Connor Melley, Ashley Galloway, Lauren Costa, Monika Maripuri, Xuan Wang, Yichi Zhang, Petra Schubert, Tianrun Cai, Zeling He, Vidul A Panickan, Morgan Rosser, Laura Tarko, Sharon Dowell, Candace Feldman, Gail Kerr, J Michael Gaziano, Peter W F Wilson, Kelly Cho, Tianxi Cai, Katherine P Liao
American Journal of Epidemiology. Journal Article. Published 2025-10-07. DOI: 10.1093/aje/kwaf223 (https://doi.org/10.1093/aje/kwaf223)
Abstract
Phenotype classification with electronic health record (EHR) data is increasingly performed with machine learning (ML); however, the performance of ML approaches in diverse populations remains understudied. We compared an ICD-based algorithm with an ML phenotyping pipeline for classifying myocardial infarction (MI) in a general population and in a self-reported Black population. We determined the impact of differential performance by replicating a published MI risk factor study with MI defined by either the ICD or the ML algorithm. Individuals followed in the Veterans Health Administration (VHA) EHR with data from 2002 to 2019 were examined: 11,523,175 Veterans, mean age 67.5 years, 93.8% male, 14.3% Black, 79.1% White. MI was classified using a published rule-based ICD algorithm and an ML pipeline, PheCAP, which incorporates natural language processing. Algorithms were trained and validated against n=403 randomly selected Veterans who were chart-reviewed for MI (gold standard), with oversampling of self-reported Black Veterans. Among chart-reviewed Veterans, the ICD algorithm had high positive predictive value (PPV) and low sensitivity (all races, PPV: 0.97, sensitivity: 0.17; Black Veterans, PPV: 0.94, sensitivity: 0.24). PheCAP MI had good PPV and higher sensitivity (all races, PPV: 0.90, sensitivity: 0.66; Black Veterans, PPV: 0.81, sensitivity: 0.79). Compared with the ICD algorithm, applying PheCAP MI to the entire VHA population to classify MI provided increased power to replicate findings from the published MI risk factor study.
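The two metrics reported in the abstract, PPV and sensitivity, can be computed by comparing an algorithm's labels with chart-review labels. The sketch below is a minimal illustration only: the names `ConfusionCounts`, `counts_from_labels`, `ppv`, and `sensitivity` are hypothetical, and the toy labels are invented for demonstration. They are not data from the study and do not reproduce the authors' pipeline.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of agreement between an algorithm and the gold standard."""
    tp: int  # algorithm says MI, chart review says MI
    fp: int  # algorithm says MI, chart review says no MI
    fn: int  # algorithm says no MI, chart review says MI
    tn: int  # algorithm says no MI, chart review says no MI


def counts_from_labels(gold: Sequence[int], predicted: Sequence[int]) -> ConfusionCounts:
    """Tally a 2x2 confusion table from paired binary labels (1 = MI, 0 = no MI)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 0)
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: TP / (TP + FP); NaN if nothing was flagged."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else math.nan


def sensitivity(c: ConfusionCounts) -> float:
    """Sensitivity (recall): TP / (TP + FN); NaN if there are no true cases."""
    true_cases = c.tp + c.fn
    return c.tp / true_cases if true_cases else math.nan


# Invented toy labels, NOT study data: 4 true MI cases, 3 flagged by the algorithm.
toy_gold = [1, 1, 1, 0, 0, 1]
toy_pred = [1, 0, 1, 0, 1, 0]
toy_counts = counts_from_labels(toy_gold, toy_pred)
print(f"PPV={ppv(toy_counts):.2f}, sensitivity={sensitivity(toy_counts):.2f}")
```

An algorithm that flags few patients but is almost always right when it does (as the abstract reports for the ICD algorithm) shows high PPV and low sensitivity under these definitions, because many chart-confirmed cases fall into the `fn` cell.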
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.