Accuracy of a New Foundation Model in Glaucoma Detection using Ocular Coherence Tomography Images

medRxiv - Ophthalmology | Pub Date: 2024-08-05 | DOI: 10.1101/2024.08.04.24311475

Benton Chuter, Justin Huynh, Evan Walker, Shahin Hallaj, Jalil Jalili, Jeffrey Liebmann, Massimo A Fazio, Christopher A Girkin, Robert N Weinreb, Mark Christopher, Linda M Zangwill



Abstract

Purpose: To fine-tune and evaluate the retinal foundation model (RETFound) for glaucoma detection from optical coherence tomography (OCT) retinal nerve fiber layer (RNFL) scans in a diverse longitudinal clinical research dataset. Subanalyses assessed model performance across subgroups, training dataset sizes, and numbers of training cycles (epochs).

Design: Evaluation of a diagnostic technology.

Subjects, Participants, and Controls: 15,216 Spectralis OCT RNFL circle scans of 747 individuals from the Diagnostic Innovations in Glaucoma Study (DIGS) and the African Descent and Glaucoma Evaluation Study (ADAGES). The cohort was diverse in race (56.9% White, 37.8% Black / African American, and 5.3% Other / Not reported), glaucoma severity (30.8% mild, 18.4% moderate-to-severe, and 50.9% no glaucoma), and age (44.8% <60 years, 55.2% >60 years). All OCT B-scans were labeled as "Non-glaucomatous" or "Glaucomatous."

Methods: RETFound was fine-tuned to perform binary glaucoma classification. Its diagnostic accuracy was tested iteratively across combinations of training dataset size (50 to 2,000 OCT RNFL circle scans), number of epochs (5 to 50), and study subpopulations stratified by glaucoma severity, age, and race.

Main Outcome Measures: Area under the receiver operating characteristic curve (AUC) for classifying RNFL scans as "Non-glaucomatous" or "Glaucomatous."

Results: Performance improved with larger training datasets and more training cycles, rising from an AUC of 0.61 (50 training images and 5 epochs) to an AUC of 0.91 (2,000 training images and 50 epochs). Gains in performance were marginal as training size increased beyond 500 scans. Performance was similar across race for all combinations of training size and number of epochs: African American (AUC=0.90) vs. other (AUC=0.93). RNFL scans from older patients (>60 years) led to worse performance (AUC=0.85) than scans from younger patients (<60 years, AUC=0.95). Performance was significantly higher for RNFL scans from patients with moderate-to-severe glaucoma than for scans from patients with mild glaucoma (AUC=0.99 vs. 0.88, respectively).

Conclusions: Good RETFound performance was observed with a relatively small number of images used for fine-tuning, and across differences in race and age. The ability of RETFound to adapt across a range of OCT training conditions and populations suggests it is a promising tool for automating glaucoma detection in a variety of use cases.
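
To make the outcome measure concrete, the sketch below shows one way an AUC and a per-subgroup AUC could be computed from binary labels and classifier scores. It is only an illustration, not the authors' evaluation code: the function names `roc_auc` and `auc_by_subgroup` are hypothetical, and the labels, scores, and age groups are made-up toy values rather than study data. The AUC is computed here as the fraction of (glaucomatous, non-glaucomatous) scan pairs in which the glaucomatous scan receives the higher score, with ties counted as half.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: iterable of 0 (non-glaucomatous) or 1 (glaucomatous)
    scores: iterable of classifier outputs, higher meaning more likely glaucomatous
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one scan of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Compute a separate AUC for each subgroup label (e.g. an age band)."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy values for illustration only (not data from the study).
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.6]
toy_groups = ["<60", "<60", "<60", "<60", ">60", ">60", ">60", ">60"]

if __name__ == "__main__":
    print(roc_auc(toy_labels, toy_scores))
    print(auc_by_subgroup(toy_labels, toy_scores, toy_groups))
```

The pairwise formulation is quadratic in the number of scans, which is fine for a small illustration; for datasets the size of the one described here, a rank-based implementation from an established statistics library would be the more practical choice.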


