Fountane Chan, MD; Wei-Chun Lin, MD, PhD; Alan Tang; Benjamin Y. Xu, MD, PhD; Sophia Y. Wang, MD, MS; Michael V. Boland, MD, PhD; Catherine Q. Sun, MD; Sally Baxter, MD, MSc; Brian Stagg, MD, MS; Michelle Hribar, PhD; Aiyin Chen, MD
Development and Evaluation of a Computable Phenotype for Normal Tension Glaucoma
Ophthalmology Science, vol. 5, no. 6, Article 100858. Published 2025-06-18. DOI: 10.1016/j.xops.2025.100858
https://www.sciencedirect.com/science/article/pii/S2666914525001563
Abstract
Purpose
To develop a computable phenotype for normal tension glaucoma (NTG) to enhance disease identification from electronic health records (EHRs).
Design
Retrospective cohort study.
Subjects
Deidentified EHR data from an academic medical center identified 1851 patients aged ≥40 years, with glaucoma and available clinical notes.
Methods
Of these 1851 patients, 200 were randomly selected for chart review to establish gold standard diagnoses. Four rule-based NTG computable phenotypes were developed and tested. Phenotype 1 relied on NTG International Classification of Diseases (ICD)-9 and ICD-10 codes. Phenotype 2 incorporated structured intraocular pressure (IOP) data and medication lists. Phenotype 3 used only structured IOP data. Phenotype 4 combined structured IOP and medication data with natural language processing (NLP) to extract IOP values and NTG mentions from chart notes. Internal and external validation were performed.
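The NLP step of phenotype 4 (extracting IOP values and NTG mentions from chart notes) can be illustrated with a minimal regex-based sketch. The abstract does not describe the actual extraction rules, so the patterns, function names, and sample note below are all hypothetical assumptions, not the authors' method.

```python
import re

# Hypothetical patterns: the abstract does not give the study's NLP rules.
# Matches "NTG" or spelled-out "normal tension glaucoma" (hyphen or space).
NTG_MENTION = re.compile(r"\b(?:NTG|normal[- ]tension glaucoma)\b", re.IGNORECASE)

# Matches "IOP" followed within a few non-digit characters by one value
# ("IOP 18") or a right/left pair ("IOP: 14/15").
IOP_VALUE = re.compile(
    r"\bIOP\b[^\d\n]{0,15}(\d{1,2})(?:\s*/\s*(\d{1,2}))?",
    re.IGNORECASE,
)


def extract_iop_values(note: str) -> list[int]:
    """Return every IOP reading (mmHg) found in a free-text note."""
    values: list[int] = []
    for first, second in IOP_VALUE.findall(note):
        values.append(int(first))
        if second:  # second eye of an "a/b" pair, if present
            values.append(int(second))
    return values


def mentions_ntg(note: str) -> bool:
    """True if the note contains an NTG mention.

    Limitation of this sketch: negation ("no evidence of NTG") is not
    handled and would still count as a mention.
    """
    return NTG_MENTION.search(note) is not None


# Hypothetical sample note, invented for illustration only.
sample_note = "IOP: 14/15 mmHg. Assessment: normal tension glaucoma OU."
iop_readings = extract_iop_values(sample_note)
has_ntg_mention = mentions_ntg(sample_note)
```

In a rule-based phenotype, outputs like these would be combined with structured IOP and medication data; how the study combined them is not stated in the abstract.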
Main Outcome Measures
F1 score, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy.
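For reference, these six measures all derive from the four cells of a confusion matrix that compares a phenotype's output with the chart-review diagnosis. The sketch below uses their standard definitions; the counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # phenotype positive, chart review positive
    fp: int  # phenotype positive, chart review negative
    fn: int  # phenotype negative, chart review positive
    tn: int  # phenotype negative, chart review negative


def outcome_measures(c: Confusion) -> dict[str, float]:
    """Standard definitions; assumes no denominator is zero."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Hypothetical counts for illustration only (not study data).
example = Confusion(tp=40, fp=10, fn=10, tn=140)
measures = outcome_measures(example)
```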
Results
Chart review identified NTG in 30% of patients, and only 7% had NTG ICD codes. Phenotype 1 had an F1 of 36.8%, sensitivity 24.1%, specificity 97%, PPV 77.8%, NPV 74.9%, and accuracy 75.1%. Compared with ICD codes, phenotypes 2 and 3 had F1 of 66.7% and 69.8%, sensitivity 77.6% and 89.7%, specificity 76.3% and 71.1%, PPV 58.4% and 57.1%, NPV 88.8% and 94.1%, and accuracy of 76.7% and 76.7%, respectively. Incorporating NLP, phenotype 4 had the best performance with an F1 of 77.4%, sensitivity 82.8%, specificity 86.7%, PPV 72.7%, NPV 92.1%, and accuracy 85.5%. Phenotypes 2 to 4 increased NTG case detection fourfold compared with phenotype 1.
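Because F1 is the harmonic mean of PPV and sensitivity, the reported F1 scores can be cross-checked against the reported PPV and sensitivity. The sketch below does this using only the percentages stated above; the variable and function names are illustrative. Since the inputs are already rounded to one decimal place, a recomputed value can differ slightly from the reported one.

```python
# Percentages as reported in the Results paragraph.
REPORTED = {
    "phenotype 1": {"ppv": 77.8, "sensitivity": 24.1, "f1": 36.8},
    "phenotype 2": {"ppv": 58.4, "sensitivity": 77.6, "f1": 66.7},
    "phenotype 3": {"ppv": 57.1, "sensitivity": 89.7, "f1": 69.8},
    "phenotype 4": {"ppv": 72.7, "sensitivity": 82.8, "f1": 77.4},
}


def f1_from(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


recomputed = {
    name: f1_from(row["ppv"], row["sensitivity"])
    for name, row in REPORTED.items()
}

for name, value in recomputed.items():
    reported = REPORTED[name]["f1"]
    print(f"{name}: recomputed F1 {value:.2f}% vs reported {reported}%")
```

Each recomputed value falls within 0.1 percentage point of the reported F1, so the reported figures are internally consistent.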
Conclusions
The normal tension glaucoma phenotype using NLP achieved the best overall performance, and phenotypes incorporating structured data performed better than ICD codes alone. The NTG ICD code-based phenotype is highly specific but lacks sensitivity. Insights from this study may inform the development of computable phenotypes for other disease subtypes within broader disease categories.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.