Validation of an Algorithm to Identify End-Stage Kidney Disease in Electronic Health Records Data
Chenxi Gao, Yunwen Xu, Sneha Mehta, Yingying Sang, Carina Flaherty, Aditya Surapaneni, Krutika Pandit, Alexander R. Chang, Jamie A. Green, Morgan E. Grams, Jung-Im Shin
American Journal of Kidney Diseases, 86(2), pages 212-221.e1. Published 2025-05-15. DOI: 10.1053/j.ajkd.2025.03.021
https://www.sciencedirect.com/science/article/pii/S0272638625008625
Abstract
Rationale & Objectives
Accurate ascertainment of end-stage kidney disease (ESKD) in electronic health records (EHRs) data is important for much epidemiological research. This study developed and validated an algorithm using diagnosis and procedure codes to identify patients with ESKD (treated with maintenance dialysis or kidney transplantation) in EHR data.
Study Design
Study of diagnostic algorithms.
Setting & Participants
The development cohort included 559,615 patients treated at the Geisinger Health System (January 1996 to June 2018). The validation cohort included 767,186 patients treated at New York University Langone Health System (January 2018 to December 2020).
Algorithms Compared
The algorithm used diagnosis and procedure codes compared with a nominal gold standard designation within the United States Renal Data System (USRDS) data. The performance of the algorithm was characterized by sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The dates of incident ESKD between the algorithm and USRDS were compared in a subset of cases.
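The four performance measures named above can be illustrated with a minimal sketch. It assumes the comparison reduces to two sets of patient identifiers within a cohort: those flagged by the algorithm and those listed in the reference standard. The names `Confusion`, `confusion_from_ids`, and `metrics` are hypothetical and are not taken from the study; the toy identifiers are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # flagged by the algorithm and present in the reference standard
    fp: int  # flagged by the algorithm but absent from the reference standard
    fn: int  # present in the reference standard but missed by the algorithm
    tn: int  # neither flagged nor present in the reference standard


def confusion_from_ids(cohort, flagged, reference):
    """Build a 2x2 table from patient-ID collections (hypothetical helper)."""
    cohort = set(cohort)
    flagged = set(flagged) & cohort      # ignore IDs outside the cohort
    reference = set(reference) & cohort
    tp = len(flagged & reference)
    fp = len(flagged - reference)
    fn = len(reference - flagged)
    tn = len(cohort) - tp - fp - fn
    return Confusion(tp, fp, fn, tn)


def metrics(c):
    """Sensitivity, specificity, PPV, and NPV as proportions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


if __name__ == "__main__":
    # Toy cohort of 10 patients; IDs are invented for illustration.
    table = confusion_from_ids(range(1, 11), {1, 2, 3}, {2, 3, 4, 5})
    print(table)
    print(metrics(table))
```

Confidence intervals are omitted because the abstract does not state which interval method was used.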
Outcome
ESKD (maintenance dialysis, prior recipient of a kidney transplant, or kidney transplantation surgery) cases.
Results
In Geisinger, we developed an ESKD algorithm that identified 4,766 (0.85%) ESKD cases; there were 5,155 (0.92%) ESKD cases reported by the USRDS. The sensitivity, specificity, PPV, and NPV of the algorithm were 73.9% (95% CI, 72.7-75.1%), 99.83% (99.82-99.84%), 79.9% (78.9-81.0%), and 99.76% (99.75-99.77%), respectively. When applying the algorithm to New York University Langone Health System data, the sensitivity, specificity, PPV, and NPV were 71.8% (95% CI, 70.7-73.0%), 99.95% (99.95-99.96%), 91.6% (90.8-92.4%), and 99.79% (99.78-99.80%), respectively. The median difference between dates of incident ESKD (algorithm minus USRDS) was −3 (IQR, −21 to 83) days for Geisinger and 0 (IQR, −12 to 69) days for New York University Langone Health.
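As a rough consistency check on the Geisinger figures, the 2x2 table can be back-calculated from the reported totals. This is a sketch under stated assumptions: the abstract does not report the true-positive count, so it is estimated here as the reported sensitivity times the USRDS case count, and all 559,615 patients are assumed to be in the analysis. The function names are hypothetical.

```python
def implied_table(n_total, n_flagged, n_reference, sensitivity):
    """Back-calculate (tp, fp, fn, tn) from totals and a reported sensitivity.

    The true-positive count is an estimate, not a figure from the abstract.
    """
    tp = round(sensitivity * n_reference)
    fp = n_flagged - tp
    fn = n_reference - tp
    tn = n_total - tp - fp - fn
    return tp, fp, fn, tn


def percent_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV expressed as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Reported in the abstract: cohort size, algorithm cases, USRDS cases,
# and sensitivity (73.9%). Everything else below is derived.
GEISINGER = implied_table(559_615, 4_766, 5_155, 0.739)
GEISINGER_METRICS = percent_metrics(*GEISINGER)

if __name__ == "__main__":
    print("implied (tp, fp, fn, tn):", GEISINGER)
    for name, value in GEISINGER_METRICS.items():
        print(f"{name}: {value:.2f}%")
```

Under this back-calculation, the four derived metrics round to the values reported for Geisinger, which suggests the reported counts and percentages are mutually consistent. The abstract does not give case counts for New York University Langone Health, so the same check is not attempted for that cohort.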
Limitations
Use of structured EHRs data only.
Conclusions
Algorithms combining diagnosis and procedure codes show high specificity and modest sensitivity for identifying patients with ESKD, providing a research tool to inform future EHRs-based studies.
Plain-Language Summary
Although electronic health records (EHRs) data holds great promise for advancing kidney research, little work has been done to accurately identify ESKD cases in these data. This study developed and validated an algorithm using diagnosis and procedure codes to identify ESKD in EHRs. Our findings showed that the algorithm performed consistently in 2 different health systems, demonstrating high specificity and negative predictive values but lower sensitivity and positive predictive value. This algorithm may inform future ESKD research using EHR data.
About the journal:
The American Journal of Kidney Diseases (AJKD), the National Kidney Foundation's official journal, is globally recognized for its leadership in clinical nephrology content. Monthly, AJKD publishes original investigations on kidney diseases, hypertension, dialysis therapies, and kidney transplantation. Rigorous peer-review, statistical scrutiny, and a structured format characterize the publication process. Each issue includes case reports unveiling new diseases and potential therapeutic strategies.