Code-Free Deep Learning Glaucoma Detection on Color Fundus Images
Daniel Milad, MD; Fares Antaki, MDCM; David Mikhail, MD (C), MSc (C); Andrew Farah, MDCM (C); Jonathan El-Khoury, MD; Samir Touma, MD; Georges M. Durr, MD; Taylor Nayman, MD; Clément Playout, PhD (C); Pearse A. Keane, MD, FRCOphth; Renaud Duval, MD
Ophthalmology Science, Volume 5, Issue 4, Article 100721. Published 2025-01-30.
DOI: 10.1016/j.xops.2025.100721
https://www.sciencedirect.com/science/article/pii/S2666914525000193
Abstract
Objective
Code-free deep learning (CFDL) allows clinicians with no coding experience to build their own artificial intelligence models. This study assesses the performance of CFDL in glaucoma detection from fundus images in comparison to expert-designed models.
Design
Deep learning model development, testing, and validation.
Subjects
A total of 101 442 labeled fundus images from the Rotterdam EyePACS Artificial Intelligence for Robust Glaucoma Screening (AIROGS) dataset were included.
Methods
Ophthalmology trainees without coding experience designed a CFDL binary model using the Rotterdam EyePACS AIROGS dataset of fundus images (101 442 labeled images) to differentiate glaucoma from normal optic nerves. We compared our results with bespoke models from the literature. We then externally validated our model on 2 datasets, the Retinal Fundus Glaucoma Challenge (REFUGE) and the Glaucoma grading from Multi-Modality imAges (GAMMA), at confidence thresholds of 0.1, 0.3, and 0.5.
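To make the threshold-based external validation concrete, the following is a minimal sketch of how PPV and accuracy could be computed at each confidence threshold. It is not the authors' pipeline: the function name `metrics_at_threshold`, the toy labels and scores, and the convention that a score at or above the threshold counts as a glaucoma prediction are all assumptions made for illustration.

```python
def metrics_at_threshold(labels, scores, threshold):
    """Compute PPV and accuracy for a binary classifier at one threshold.

    labels: 1 for glaucoma, 0 for normal.
    scores: model confidence for the glaucoma class, in [0, 1].
    Assumption: a score >= threshold is treated as a glaucoma prediction.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    # PPV is undefined when the model makes no positive predictions.
    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    accuracy = (tp + tn) / len(labels)
    return {"ppv": ppv, "accuracy": accuracy}


# Toy data, purely illustrative (not from AIROGS, REFUGE, or GAMMA).
toy_labels = [0, 0, 0, 0, 1, 1]
toy_scores = [0.05, 0.2, 0.35, 0.6, 0.4, 0.9]

for t in (0.1, 0.3, 0.5):
    print(t, metrics_at_threshold(toy_labels, toy_scores, t))
```

Raising the threshold generally trades false positives for false negatives, which is why PPV and accuracy are reported as ranges across the three cutoffs.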
Main Outcome Measures
Area under the precision-recall curve (AuPRC), sensitivity at 95% specificity (SE@95SP), accuracy, area under the receiver operating characteristic curve (AUC), and positive predictive value (PPV).
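Two of these measures, SE@95SP and AUC, do not depend on a single confidence cutoff. The sketch below shows one plausible way to compute them from labels and scores, covering only these two measures. The function names and toy data are hypothetical, the AUC uses the rank-based (Mann-Whitney) formulation, and nothing here is taken from the study's own evaluation code.

```python
def sensitivity_at_specificity(labels, scores, min_specificity=0.95):
    """Best sensitivity among thresholds whose specificity >= min_specificity.

    labels: 1 for glaucoma, 0 for normal; scores: confidence for glaucoma.
    Assumption: a score >= threshold is treated as a glaucoma prediction.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = 0.0  # a threshold above every score gives sensitivity 0, specificity 1
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        if tn / negatives >= min_specificity:
            best = max(best, tp / positives)
    return best


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, purely illustrative.
demo_labels = [0, 0, 0, 0, 1, 1, 1, 1]
demo_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

print(sensitivity_at_specificity(demo_labels, demo_scores))  # 0.75
print(roc_auc(demo_labels, demo_scores))                     # 0.9375
```

Note that AUC is a unitless value between 0 and 1, whereas SE@95SP, PPV, and accuracy are proportions that are usually reported as percentages.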
Results
The CFDL model showed high performance metrics that were comparable to the bespoke deep learning models. Our single-label classification model had an AuPRC of 0.988, an SE@95SP of 95%, and an accuracy of 91% (compared with 85% SE@95SP for the top bespoke models). Using the REFUGE dataset for external validation, our model had an SE@95SP of 83%, an AUC of 0.960, a PPV of 73% to 94%, and an accuracy of 95% to 98% across the 0.1, 0.3, and 0.5 confidence threshold cutoffs. Using the GAMMA dataset for external validation at the same cutoffs, our model had an SE@95SP of 98%, an AUC of 0.994, a PPV of 94% to 96%, and an accuracy of 94% to 97%.
Conclusion
The capacity of CFDL models to perform glaucoma screening using fundus images presents a compelling proof of concept, empowering clinicians to explore innovative model designs for broad glaucoma screening in the near future.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.