Open-Source Periorbital Segmentation Dataset for Ophthalmic Applications
George R. Nahass BA, Emma Koehler BS, Nicholas Tomaras BS, Danny Lopez BS, Madison Cheung BS, Alexander Palacios BA, Jeffrey C. Peterson MD, PhD, Sasha Hubschman MD, Kelsey Green BS, Chad A. Purnell MD, Pete Setabutr MD, Ann Q. Tran MD, Darvin Yi PhD
Ophthalmology Science, volume 5, issue 4, Article 100757. Published 2025-03-05. DOI: 10.1016/j.xops.2025.100757
https://www.sciencedirect.com/science/article/pii/S2666914525000557
Abstract
Objective
We aimed to create and validate a dataset for oculoplastic segmentation and periorbital distance prediction.
Design
This was an experimental study.
Subjects
Images of faces from 2 open-source datasets were included in this study.
Methods
The images were sourced from 2 open-source datasets and cropped to include only the eyes. The iris, sclera, lid, caruncle, and brow were segmented in all images by 5 trained annotators. Intergrader reliability was assessed by having the 5 annotators annotate the same 100 randomly selected images after a forgetting period of at least 2 weeks. Intragrader reliability was assessed by having the 5 annotators re-annotate the same 20 images after a 2-week forgetting period. Three DeepLabV3 segmentation models were trained on the datasets following standard procedures.
Main Outcome Measures
The quality of the annotations was evaluated by Dice score in intragrader and intergrader experiments. Segmentation models were trained to demonstrate the dataset's utility for deep learning, and the Dice score was also used to evaluate these models.
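For readers unfamiliar with the metric, the Dice score between two masks A and B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of a per-class Dice computation, not the authors' evaluation code. The integer label encoding in `CLASSES` and the convention of scoring two empty masks as 1.0 are assumptions made for illustration; the source does not specify either.

```python
import numpy as np

# Hypothetical label encoding; the source does not state how classes are stored.
CLASSES = {1: "iris", 2: "sclera", 3: "lid", 4: "caruncle", 5: "brow"}


def dice_score(a: np.ndarray, b: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Both masks are empty: scored as perfect agreement (assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def per_class_dice(labels_a: np.ndarray, labels_b: np.ndarray, classes=CLASSES) -> dict:
    """Dice score for each class between two integer label maps."""
    return {
        name: dice_score(labels_a == idx, labels_b == idx)
        for idx, name in classes.items()
    }


# Toy example: two 4-pixel "iris" regions that share 2 pixels.
a = np.zeros((4, 4), dtype=int)
b = np.zeros((4, 4), dtype=int)
a[0:2, 0:2] = 1
b[0:2, 1:3] = 1
scores = per_class_dice(a, b)
print(scores["iris"])  # 2 * 2 / (4 + 4) = 0.5
```

Averaging the per-class values gives a single score per image pair, which is one common way to arrive at an "average Dice score across all classes".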
Results
We annotated 2842 images. Agreement between annotators (intergrader) on a randomly selected subset of 100 images was very high, with an average Dice score of 0.82 ± 0.01. Intragrader analysis showed that each grader reproduced their own annotations consistently, with an average Dice score across all classes of 0.81 ± 0.08. Segmentation networks trained on the Chicago Facial dataset, the CelebAMask-HQ dataset, and both combined achieved average Dice scores across all classes of 0.90 ± 0.11, 0.81 ± 0.20, and 0.84 ± 0.18, respectively.
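One way an intergrader figure of the form "mean ± SD" can be obtained is by scoring every pair of annotators on the same image and summarizing the pairwise Dice scores. The sketch below assumes that pairwise scheme and a population standard deviation; the source does not say whether the authors used pairwise comparison, comparison against a consensus mask, or another aggregation, so treat this as an illustration only. The function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice score between two binary masks; two empty masks score 1.0 (assumed)."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(a, b).sum() / denom)


def pairwise_agreement(masks: list) -> tuple:
    """Mean and population SD of Dice over all unordered annotator pairs."""
    scores = [dice(a, b) for a, b in combinations(masks, 2)]
    return float(np.mean(scores)), float(np.std(scores))


# Toy example with three annotators: two agree exactly, one is shifted by a pixel.
m1 = np.zeros((4, 4), dtype=bool)
m1[0:2, 0:2] = True
m2 = m1.copy()
m3 = np.zeros((4, 4), dtype=bool)
m3[0:2, 1:3] = True

mean_dice, sd_dice = pairwise_agreement([m1, m2, m3])
print(f"{mean_dice:.2f} ± {sd_dice:.2f}")
```

With 5 annotators this yields 10 pairs per image; the same function applied to a grader's first and second pass on an image gives an intragrader score.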
Conclusions
We have developed a first-of-its-kind dataset for use in oculoplastic and craniofacial segmentation tasks. All the annotations are publicly available for free download. Access to segmentation datasets designed specifically for oculoplastic surgery will permit more rapid development of clinically useful segmentation networks that can be leveraged for periorbital distance prediction and other downstream tasks. In addition to the annotations, we provide an open-source toolkit for periorbital distance prediction from segmentation masks, which is available via an application programming interface. The weights of all models have also been open-sourced and are publicly available for use by the community.
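To illustrate what predicting a periorbital distance from a segmentation mask can look like, the sketch below estimates a vertical fissure height from an integer label map. It is not the authors' toolkit or its API. The label ids, the function name `fissure_height_mm`, the choice of measurement, and the use of an assumed average iris diameter (11.7 mm) as the pixel-to-millimeter scale are all assumptions introduced here for illustration.

```python
import numpy as np

# Hypothetical label ids; the source does not specify an encoding.
IRIS, SCLERA = 1, 2
# Assumed population-average iris diameter, used only to convert pixels to mm.
ASSUMED_IRIS_DIAMETER_MM = 11.7


def fissure_height_mm(labels: np.ndarray) -> float:
    """Vertical extent of visible iris/sclera through the iris center, in mm."""
    iris = labels == IRIS
    if not iris.any():
        raise ValueError("no iris pixels in mask")
    _, cols = np.nonzero(iris)

    # The horizontal iris diameter in pixels sets the pixel-to-mm scale.
    iris_diameter_px = cols.max() - cols.min() + 1
    mm_per_px = ASSUMED_IRIS_DIAMETER_MM / iris_diameter_px

    # Measure the visible eye (iris or sclera) along the column through
    # the horizontal center of the iris.
    center_col = int(round(cols.mean()))
    eye_rows = np.nonzero(np.isin(labels[:, center_col], (IRIS, SCLERA)))[0]
    if eye_rows.size == 0:
        raise ValueError("no eye pixels in the iris-center column")
    height_px = eye_rows.max() - eye_rows.min() + 1
    return float(height_px * mm_per_px)


# Toy mask: a band of sclera with a 10-pixel-wide, 6-pixel-tall iris on top.
labels = np.zeros((20, 30), dtype=int)
labels[8:12, 5:25] = SCLERA
labels[7:13, 10:20] = IRIS
height = fissure_height_mm(labels)
print(f"{height:.2f} mm")
```

Scaling by the iris is a design choice that avoids needing camera calibration, at the cost of inheriting any error in the assumed diameter or in the iris segmentation itself.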
Financial Disclosure(s)
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.