Validation of Deep Learning–Based Automatic Retinal Layer Segmentation Algorithms for Age-Related Macular Degeneration with 2 Spectral-Domain OCT Devices
Authors: Souvick Mukherjee PhD; Tharindu De Silva PhD; Cameron Duic BS; Gopal Jayakar BS; Tiarnan D.L. Keenan BM BCh, PhD; Alisa T. Thavikulwat MD; Emily Chew MD; Catherine Cukras MD, PhD
Journal: Ophthalmology Science, 5(3), Article 100670 (impact factor 3.2; JCR Q1, Ophthalmology)
DOI: 10.1016/j.xops.2024.100670
Publication date: 2024-12-04
URL: https://www.sciencedirect.com/science/article/pii/S2666914524002069
Citation count: 0
Abstract
Purpose
Segmentations of retinal layers in spectral-domain OCT (SD-OCT) images serve as a crucial tool for identifying and analyzing the progression of various retinal diseases, encompassing a broad spectrum of abnormalities associated with age-related macular degeneration (AMD). The training of deep learning algorithms necessitates well-defined ground truth labels, validated by experts, to delineate boundaries accurately. However, this resource-intensive process has constrained the widespread application of such algorithms across diverse OCT devices. This work validates deep learning image segmentation models across multiple OCT devices by testing robustness in generating clinically relevant metrics.
Design
Prospective comparative study.
Participants
Adults >50 years of age whose AMD status in ≥1 eye ranged from no AMD to advanced AMD, as defined in the Age-Related Eye Disease Study, were enrolled. A total of 402 SD-OCT scans were used in this study.
Methods
We evaluate 2 separate state-of-the-art segmentation algorithms by training them on images obtained from 1 OCT device (Heidelberg-Spectralis) and then testing them on images acquired from 2 OCT devices (Heidelberg-Spectralis and Zeiss-Cirrus). The dataset spans a range of retinal pathology, from disease-free eyes to severe forms of AMD, and the assessment focuses on the device independence of the algorithms.
Main Outcome Measures
Performance metrics (including mean squared error, mean absolute error [MAE], and Dice coefficients) for the segmentations of the internal limiting membrane (ILM), the retinal pigment epithelium (RPE), and the RPE-to-Bruch’s membrane region, along with en face thickness maps and volumetric estimations (in mm³). Violin plots and Bland–Altman plots comparing predictions against ground truth are also presented.
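As a rough illustration of how the listed metrics are commonly computed, the following Python sketch uses NumPy on toy data. It is not the authors' code: the function names, the representation of a boundary as one row index per A-scan, the empty-mask convention in the Dice function, and the pixel spacings passed to the volume function are all assumptions made for this example.

```python
import numpy as np

def boundary_errors(pred_rows, true_rows):
    """Return (MAE, MSE) in pixels between two boundary traces.

    Each input holds one boundary row index per A-scan (image column).
    This representation is an assumption of the sketch.
    """
    diff = np.asarray(pred_rows, dtype=float) - np.asarray(true_rows, dtype=float)
    return float(np.mean(np.abs(diff))), float(np.mean(diff ** 2))

def dice(pred_mask, true_mask):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, true).sum() / total)

def volume_mm3(thickness_um, dx_mm, dy_mm):
    """Volume from an en face thickness map: sum of thickness x pixel area."""
    thickness_mm = np.asarray(thickness_um, dtype=float) / 1000.0
    return float(thickness_mm.sum() * dx_mm * dy_mm)

# Toy data, not taken from the study.
ilm_true = np.array([10, 11, 12, 12])
ilm_pred = np.array([10, 12, 12, 14])
mae, mse = boundary_errors(ilm_pred, ilm_true)

region_true = np.array([[1, 1, 0], [0, 1, 0]])
region_pred = np.array([[1, 1, 1], [0, 0, 0]])
region_dice = dice(region_pred, region_true)

# A 2 x 2 map of 250 μm thickness with hypothetical 0.01 mm pixel spacing.
toy_volume = volume_mm3(np.full((2, 2), 250.0), dx_mm=0.01, dy_mm=0.01)

print(mae, mse, region_dice, toy_volume)
```

On the toy data the boundary differs by 0, 1, 0, and 2 pixels, giving an MAE of 0.75 and an MSE of 1.25, and the two masks share 2 of their 3 pixels each, giving a Dice coefficient of 2/3.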
Results
The UNet and DeepLabv3 models, trained on Spectralis B-scans, demonstrate clinically useful outcomes when applied to Cirrus test B-scans. Review of the Cirrus test data by 2 independent annotators revealed that the aggregated MAE in pixels was 1.82 ± 0.24 for the ILM (equivalent to 7.0 ± 0.9 μm) and 2.46 ± 0.66 for the RPE (9.5 ± 2.6 μm). Additionally, the Dice similarity coefficient for the RPE drusen complex region, comparing predictions with ground truth, reached 0.87 ± 0.01.
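The pixel and micrometre values above are related by an axial scale factor. The abstract does not state that factor; the sketch below assumes 3.87 μm per pixel, chosen only because it reproduces the reported micrometre values after rounding to one decimal place. The constant and function names are hypothetical.

```python
# Assumed axial scale in micrometres per pixel. This value is NOT stated
# in the abstract; it is inferred because it reproduces the reported pairs.
AXIAL_UM_PER_PIXEL = 3.87

def pixels_to_microns(value_px, scale=AXIAL_UM_PER_PIXEL):
    """Convert an axial distance from pixels to micrometres."""
    return value_px * scale

# Reported MAE (mean, SD) in pixels for each boundary on the Cirrus test data.
reported_px = [("ILM", 1.82, 0.24), ("RPE", 2.46, 0.66)]

for label, mean_px, sd_px in reported_px:
    mean_um = round(pixels_to_microns(mean_px), 1)
    sd_um = round(pixels_to_microns(sd_px), 1)
    print(f"{label}: {mean_um} ± {sd_um} μm")
```

Under this assumed scale the loop prints 7.0 ± 0.9 μm for the ILM and 9.5 ± 2.6 μm for the RPE, matching the figures in the text.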
Conclusions
In pursuing task-specific goals such as retinal layer segmentation, a segmentation network can learn domain-independent features from a large training dataset. This allows the network to be applied in domains where ground truth is hard to generate.
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.