Bennett J Waxse, Fausto Andres Bustos Carrillo, Tam C Tran, Huan Mo, Emily E Ricotta, Joshua C Denny
Title: Computable Phenotypes for Respiratory Viral Infections in the All of Us Research Program
Journal: medRxiv: the preprint server for health sciences
Published: 2025-01-18
DOI: https://doi.org/10.1101/2025.01.17.25320744
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11759596/pdf/
Abstract
Electronic health records (EHRs) contain rich temporal data about infectious diseases, but the optimal approach to identifying infections remains undefined. Using the All of Us Research Program, we developed computable phenotypes for respiratory viruses by integrating billing codes, prescriptions, and laboratory results within 90-day episodes. Phenotypes computed from 265,222 participants yielded cohorts ranging from 238 (adenovirus) to 28,729 (SARS-CoV-2) cases. Virus-specific billing codes showed varied sensitivity (8-67%) and high positive predictive value (PPV; 90-97%), except for influenza virus and SARS-CoV-2, where the lower PPV (69-70%) improved as the number of billing codes increased. Identified infections exhibited the expected seasonal patterns and virus proportions when compared with CDC data. This integrated approach identified episodic disease more effectively than any individual component alone and proved useful for identifying severe infections. The method enables large-scale studies of host genetics, health disparities, and clinical outcomes across episodic diseases.
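
The idea of grouping evidence into 90-day episodes and then scoring billing codes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how episode boundaries are drawn or which reference standard was used, so the sketch assumes that an episode starts at the first event and absorbs every event within 90 days of that start, and that a positive laboratory result serves as the reference. The event records, the source labels (`billing`, `prescription`, `lab`), and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

EPISODE_WINDOW = timedelta(days=90)  # window length stated in the abstract


@dataclass
class Episode:
    start: date
    sources: set = field(default_factory=set)


def group_into_episodes(events):
    """Group (date, source) events for one participant and one virus.

    Assumption: an event joins the current episode if it falls within
    90 days of that episode's first event; otherwise it opens a new one.
    """
    episodes = []
    for when, source in sorted(events):
        if episodes and when - episodes[-1].start < EPISODE_WINDOW:
            episodes[-1].sources.add(source)
        else:
            episodes.append(Episode(start=when, sources={source}))
    return episodes


def sensitivity_and_ppv(episodes, test="billing", reference="lab"):
    """Score one evidence source against another at the episode level.

    Assumption: the reference source (here a positive lab result) is
    treated as ground truth, which the abstract does not confirm.
    """
    tp = sum(1 for e in episodes if test in e.sources and reference in e.sources)
    fp = sum(1 for e in episodes if test in e.sources and reference not in e.sources)
    fn = sum(1 for e in episodes if test not in e.sources and reference in e.sources)
    sensitivity = tp / (tp + fn) if tp + fn else None
    ppv = tp / (tp + fp) if tp + fp else None
    return sensitivity, ppv


# Hypothetical events for a single participant and a single virus.
events = [
    (date(2022, 1, 5), "billing"),
    (date(2022, 1, 7), "lab"),
    (date(2022, 2, 1), "prescription"),
    (date(2022, 12, 10), "billing"),
    (date(2023, 1, 20), "lab"),
    (date(2023, 6, 1), "billing"),
    (date(2024, 1, 3), "lab"),
]

episodes = group_into_episodes(events)
sensitivity, ppv = sensitivity_and_ppv(episodes)
print(len(episodes), sensitivity, ppv)
```

Anchoring the window at the episode's first event keeps each episode at most 90 days long. A rolling window measured from the most recent event would be an equally plausible reading of the abstract and would merge long chains of closely spaced events into a single episode.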