Artificial Intelligence Models to Identify Patients at High Risk for Glaucoma Using Self-reported Health Data in a United States National Cohort

Impact factor: 3.2 (JCR Q1, Ophthalmology)

Ophthalmology Science. Pub Date: 2024-12-17. DOI: 10.1016/j.xops.2024.100685

Rohith Ravindranath MS, Joel Naor MD, MS, Sophia Y. Wang MD, MS

Volume 5, Issue 3, Article 100685. https://www.sciencedirect.com/science/article/pii/S2666914524002215

Citations: 0

Abstract

Purpose:

Early glaucoma detection is key to preventing vision loss, but screening often requires specialized eye examination or photography, which limits large-scale implementation. This study sought to develop artificial intelligence models that use self-reported health data from surveys to prescreen for patients at high risk of glaucoma, who are most in need of screening with ophthalmic examination and imaging.

Design:

Cohort study.

Participants:

Participants enrolled from May 1, 2018, to July 1, 2022, in the nationwide All of Us Research Program who were ≥18 years of age, had ≥2 eye-related diagnoses in their electronic health record (EHR), and submitted surveys with self-reported health history.

Methods:

We developed models to predict the risk of glaucoma using 3 machine learning approaches: (1) penalized logistic regression, (2) XGBoost, and (3) a fully connected neural network. Glaucoma diagnosis, the outcome label, was identified from International Classification of Diseases codes extracted from EHR data. The data were split 80/20 into training and test sets, and hyperparameters were tuned by cross-validation. Input features included self-reported demographics, general health, lifestyle factors, and family and personal medical history.
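As an illustration of the data-splitting step only, the following is a minimal sketch of an 80/20 train-test split followed by k-fold partitioning of the training portion for hyperparameter tuning. It is not the authors' code: the helper names `stratified_split` and `k_folds`, the use of stratification, the choice of 5 folds, and the synthetic labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with the class ratio preserved in each part."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, validation_idx) pairs that partition `indices` into k folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, validation


# Synthetic labels (illustrative only): 1000 participants, 10% positive.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 800 200

# Hyperparameters would be tuned on these folds; the test set stays untouched.
for fit_idx, validation_idx in k_folds(train_idx, k=5):
    print(len(fit_idx), len(validation_idx))  # 640 160 for each fold
```

Stratifying the split keeps the proportion of glaucoma cases similar in the training and test sets, which matters when the positive class is a small minority. The folds in this sketch are not themselves stratified.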

Main Outcome Measures:

Models were evaluated using standard classification metrics, including area under the receiver operating characteristic curve (AUROC).
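For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient with glaucoma receives a higher risk score than a randomly chosen patient without it. The sketch below computes it directly from that definition. The function name `auroc` and the toy scores are hypothetical and are not taken from the study.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 2 cases, 3 non-cases, hypothetical risk scores.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(auroc(y_true, scores))  # 5 of 6 case/non-case pairs are ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from non-cases.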

Results:

Among the 8205 patients, 873 (10.64%) were diagnosed with glaucoma. Across models, AUROC scores for identifying which patients had glaucoma from survey health data ranged from 0.710 to 0.890. XGBoost achieved the highest AUROC of 0.890 (95% confidence interval [CI]: 0.860–0.910). Logistic regression followed with an AUROC of 0.772 (95% CI: 0.753–0.795). Explainability studies revealed that key features included traditionally recognized risk factors for glaucoma, such as age, type 2 diabetes, and a family history of glaucoma.
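The abstract does not say how the 95% confidence intervals were obtained. One common option is the percentile bootstrap, sketched below under that assumption: test-set patients are resampled with replacement, AUROC is recomputed on each resample, and the 2.5th and 97.5th percentiles are reported. The function names, the number of resamples, and the synthetic scores are illustrative choices, not details from the study.

```python
import random


def auroc_from_ranks(y_true, scores):
    """Rank-based (Mann-Whitney) AUROC; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(y_true, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC over resampled test patients."""
    rng = random.Random(seed)
    n = len(y_true)
    estimates = []
    while len(estimates) < n_resamples:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in sample]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            estimates.append(auroc_from_ranks(ys, [scores[i] for i in sample]))
    estimates.sort()
    lower = estimates[round((alpha / 2) * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic test set (illustrative only): about 10.64% cases, matching the
# reported prevalence; cases get scores shifted upward by one unit.
rng = random.Random(1)
y_true = [1 if rng.random() < 0.1064 else 0 for _ in range(300)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in y_true]
point = auroc_from_ranks(y_true, scores)
lower, upper = bootstrap_ci(y_true, scores, n_resamples=500)
print(f"AUROC {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With only about 10% of patients being cases, the interval width is driven largely by the number of cases in the test set, so a larger test set or more cases would narrow it.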

Conclusions:

Machine learning and deep learning models used health data from self-reported surveys to predict glaucoma diagnosis without additional data from ophthalmic imaging or eye examination. These models may eventually enable prescreening for glaucoma in a wide variety of low-resource settings, after which high-risk patients can be referred for targeted screening using more specialized ophthalmic examination or imaging.

Financial Disclosure(s):

The author(s) have no proprietary or commercial interest in any materials discussed in this article.