Van Nguyen MD, Sreenidhi Iyengar, Haroon Rasheed MD, Galo Apolo, Zhiwei Li, Aniket Kumar, Hong Nguyen, Austin Bohner MD, Kyle Bolo MD, Rahul Dhodapkar MD, Jiun Do MD, PhD, Andrew T. Duong MD, Jeffrey Gluckstein MD, Kendra Hong MD, Lucas L. Humayun, Alanna James MD, Junhui Lee MD, Kent Nguyen OD, Brandon J. Wong MD, Jose-Luis Ambite PhD, Benjamin Y. Xu MD, PhD
Ophthalmology Science, volume 5, issue 4, Article 100751. Published 2025-02-25. DOI: 10.1016/j.xops.2025.100751. https://www.sciencedirect.com/science/article/pii/S2666914525000491
Comparison of Deep Learning and Clinician Performance for Detecting Referable Glaucoma from Fundus Photographs in a Safety Net Population
Purpose
Develop and test a deep learning (DL) algorithm for detecting referable glaucoma.
Design
Retrospective cohort study.
Participants
A total of 6116 patients from the Los Angeles County (LAC) Department of Health Services (DHS) were included.
Methods
The data set comprised fundus photographs and patient-level labels of referable glaucoma (cup-to-disc ratio ≥0.6) provided by 21 certified optometrists. A DL algorithm based on the Visual Geometry Group-19 architecture was trained using patient-level labels generalized to images from both eyes. Area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity were calculated to assess algorithm performance using an independent test set that was also graded by 13 clinicians with 0 to 10 years of experience. Algorithm performance was tested using reference labels provided by either LAC DHS optometrists or an expert panel of 3 glaucoma specialists.
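The idea of generalizing a patient-level label to the images from both eyes can be illustrated with a minimal Python sketch. The function names, the `(image_id, patient_id)` record layout, and in particular the max-over-eyes rule for turning image scores back into one patient score are assumptions made for illustration; the abstract does not state how image-level outputs were combined.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (image_id, patient_id).
ImageRecord = Tuple[str, str]


def propagate_labels(
    patient_labels: Dict[str, int], images: List[ImageRecord]
) -> Dict[str, int]:
    """Give every image the label of the patient it belongs to."""
    return {image_id: patient_labels[patient_id] for image_id, patient_id in images}


def aggregate_to_patient(
    image_scores: Dict[str, float], images: List[ImageRecord]
) -> Dict[str, float]:
    """Collapse image-level scores to one score per patient.

    Assumption: the patient score is the maximum over that patient's images,
    so one suspicious eye is enough to flag the patient.
    """
    per_patient: Dict[str, List[float]] = defaultdict(list)
    for image_id, patient_id in images:
        per_patient[patient_id].append(image_scores[image_id])
    return {pid: max(scores) for pid, scores in per_patient.items()}
```

Under this rule a patient is flagged when either eye looks suspicious, which fits a screening setting, but a mean or a learned combination would be equally plausible choices.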
Main Outcome Measures
Area under the receiver operating characteristic curve, sensitivity, and specificity.
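For readers less familiar with these three measures, the following Python sketch shows one standard way to compute them from binary reference labels and continuous scores. It is not the authors' code: the function names and the 0.5 default threshold are assumptions, and the AUROC uses the pairwise (Mann-Whitney) formulation, which is quadratic in the number of cases and intended only for illustration.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

A human grader who gives only a yes/no call yields a single sensitivity and specificity pair, whereas an algorithm's continuous score yields a whole ROC curve, which is why grader operating points are typically compared against the algorithm's curve rather than a single number.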
Results
The DL algorithm was trained using 12 998 images from 5616 patients (2086 referable glaucoma, 3530 nonglaucoma). In this data set, the mean age was 56.8 ± 10.5 years with 54.8% women, 68.2% Latinos, 8.9% Blacks, 6.0% Asians, and 2.7% Whites. One thousand images from 500 patients (250 referable glaucoma, 250 nonglaucoma) with similar demographics (P ≥ 0.57) were used to test the algorithm. Algorithm performance matched or exceeded that of all independent clinician graders in detecting patient-level referable glaucoma based on LAC DHS optometrist (AUROC = 0.92) or expert panel (AUROC = 0.93) reference labels. Clinician grader sensitivity (range, 0.33–0.99) and specificity (range, 0.68–0.98) ranged widely and did not correlate with years of experience (P ≥ 0.49). Algorithm performance (AUROC = 0.93) also matched or exceeded the sensitivity (range, 0.78–1.00) and specificity (range, 0.32–0.87) of 6 certified LAC DHS optometrists in the subsets of the test data set they graded.
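The cohort counts reported above can be cross-checked with a few lines of Python. The numbers are taken directly from the abstract; the variable names are illustrative only.

```python
# Counts as reported in the abstract.
total_patients = 6116
train_patients, train_images = 5616, 12998
train_referable, train_nonglaucoma = 2086, 3530
test_patients, test_images = 500, 1000
test_referable, test_nonglaucoma = 250, 250

# Training and test patients should add up to the full cohort.
assert train_patients + test_patients == total_patients
# Class counts should add up to the patients in each split.
assert train_referable + train_nonglaucoma == train_patients
assert test_referable + test_nonglaucoma == test_patients

# Share of referable glaucoma in each split.
train_prevalence = train_referable / train_patients
test_prevalence = test_referable / test_patients

# Average number of images per patient in each split.
train_images_per_patient = train_images / train_patients
test_images_per_patient = test_images / test_patients

print(f"train prevalence: {train_prevalence:.3f}")
print(f"test prevalence:  {test_prevalence:.3f}")
print(f"images per patient (train): {train_images_per_patient:.2f}")
print(f"images per patient (test):  {test_images_per_patient:.2f}")
```

The counts are internally consistent. The test set is balanced by design (half referable glaucoma), while roughly 37% of training patients carry the referable label, so threshold-dependent measures such as sensitivity and specificity are easier to carry over to other populations than predictive values would be.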
Conclusions
A DL algorithm for detecting referable glaucoma trained using patient-level data provided by certified LAC DHS optometrists approximates or exceeds performance by ophthalmologists and optometrists, who exhibit variable sensitivity and specificity unrelated to experience level. Implementation of this algorithm in screening workflows could help reallocate resources and provide more reproducible and timely glaucoma care.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.