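Artificial Intelligence Models to Identify Patients at High Risk for Glaucoma Using Self-reported Health Data in a United States National Cohort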

Impact factor: 3.2; JCR Q1 (Ophthalmology)

Ophthalmology Science. Pub Date: 2024-12-17. DOI: 10.1016/j.xops.2024.100685

Rohith Ravindranath MS, Joel Naor MD, MS, Sophia Y. Wang MD, MS

Volume 5, Issue 3, Article 100685. Full text: https://www.sciencedirect.com/science/article/pii/S2666914524002215

Citations: 0

Abstract


Artificial Intelligence Models to Identify Patients at High Risk for Glaucoma Using Self-reported Health Data in a United States National Cohort

Purpose:

Early glaucoma detection is key to preventing vision loss, but screening often requires specialized eye examination or photography, limiting large-scale implementation. This study sought to develop artificial intelligence models that use self-reported health data from surveys to prescreen patients at high risk for glaucoma who are most in need of glaucoma screening with ophthalmic examination and imaging.

Design:

Cohort study.

Participants:

Participants enrolled from May 1, 2018, to July 1, 2022, in the nationwide All of Us Research Program who were ≥18 years of age, had ≥2 eye-related diagnoses in their electronic health record (EHR), and submitted surveys with self-reported health history.

Methods:

We developed models to predict the risk of glaucoma, as determined by EHR diagnosis codes, using 3 machine learning approaches: (1) penalized logistic regression, (2) XGBoost, and (3) a fully connected neural network. Glaucoma diagnosis was identified based on International Classification of Diseases codes extracted from EHR data. An 80/20 train–test split was implemented, with cross-validation employed for hyperparameter tuning. Input features included self-reported demographics, general health, lifestyle factors, and family and personal medical history.
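The data-partitioning step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say whether the 80/20 split was stratified by diagnosis or how many cross-validation folds were used, so the stratification, the choice of 5 folds, and the helper names `stratified_split` and `k_fold` are all hypothetical. The labels are synthetic and only reproduce the reported cohort size (8205) and case count (873).

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_fold(indices, k=5, seed=0):
    """Yield (fit_idx, validation_idx) pairs; each index is validated exactly once."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]


# Synthetic labels: 873 glaucoma cases among 8205 participants (counts from the abstract).
labels = [1] * 873 + [0] * (8205 - 873)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 6564 1641

# Hyperparameters would be tuned on the training portion only (assumed 5 folds).
for fit_idx, val_idx in k_fold(train_idx, k=5):
    pass  # fit a candidate model on fit_idx, score it on val_idx
```

Keeping the held-out 20% out of the tuning loop is what lets the test-set metrics serve as an estimate of performance on unseen participants.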

Main Outcome Measures:

Models were evaluated using standard classification metrics, including area under the receiver operating characteristic curve (AUROC).
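For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient with glaucoma receives a higher risk score than a randomly chosen patient without it. The sketch below computes it directly from that definition (the Mann-Whitney formulation, with ties counted as one half). The function name `auroc` and the four toy patients are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (case, non-case) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: four hypothetical patients, two with glaucoma (label 1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

Three of the four case/non-case pairs are ordered correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.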

Results:

Among the 8205 patients, 873 (10.64%) were diagnosed with glaucoma. Across models, AUROC scores for identifying which patients had glaucoma from survey health data ranged from 0.710 to 0.890. XGBoost achieved the highest AUROC of 0.890 (95% confidence interval [CI]: 0.860–0.910). Logistic regression followed with an AUROC of 0.772 (95% CI: 0.753–0.795). Explainability studies revealed that key features included traditionally recognized risk factors for glaucoma, such as age, type 2 diabetes, and a family history of glaucoma.
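The abstract reports 95% confidence intervals for AUROC but does not state how they were obtained. One common approach, shown here purely as an assumption, is the percentile bootstrap: resample patients with replacement, recompute AUROC each time, and take the 2.5th and 97.5th percentiles. The function names, the number of resamples, and the synthetic scores below are hypothetical and do not reproduce the study's results.

```python
import random


def auroc(labels, scores):
    """Fraction of (case, non-case) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_resamples=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample is missing a class, so AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_resamples)]
    high = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Synthetic test set: 200 patients, about 10.5% cases, cases score higher on average.
rng = random.Random(1)
labels = [1] * 21 + [0] * 179
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUROC {point:.3f} (95% CI: {low:.3f} to {high:.3f})")

# Sanity check on the reported prevalence: 873 of 8205 participants.
print(round(100 * 873 / 8205, 2))  # 10.64
```

With few cases in a test set the interval widens quickly, which is one reason the interval matters as much as the point estimate when comparing models.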

Conclusions:

Machine and deep learning models successfully utilized health data from self-reported surveys to predict glaucoma diagnosis without additional data from ophthalmic imaging or eye examination. These models may eventually enable prescreening for glaucoma in a wide variety of low-resource settings, after which high-risk patients can be referred for targeted screening using more specialized ophthalmic examination or imaging.
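The prescreening workflow described above amounts to choosing a risk-score threshold and referring everyone at or above it. The sketch below shows how sensitivity, specificity, and the referred fraction follow from such a threshold. The abstract reports no operating threshold, so the cutoff of 0.5, the function name `referral_summary`, and the ten toy patients are hypothetical.

```python
def referral_summary(labels, scores, threshold):
    """Summarize a refer-if-score-at-or-above-threshold rule on labeled data."""
    tp = fn = fp = tn = 0
    for y, s in zip(labels, scores):
        referred = s >= threshold
        if y == 1 and referred:
            tp += 1
        elif y == 1:
            fn += 1
        elif referred:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # share of cases that get referred
        "specificity": tn / (tn + fp),  # share of non-cases that are not referred
        "referred_fraction": (tp + fp) / len(labels),
    }


# Ten hypothetical patients, three with glaucoma (label 1).
toy_labels = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.3, 0.4, 0.5, 0.05, 0.15]
summary = referral_summary(toy_labels, toy_scores, threshold=0.5)
print(summary)
```

Lowering the threshold raises sensitivity at the cost of referring more people, so in a low-resource setting the cutoff would be set by how many follow-up examinations are available.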