Statistical Evaluation of Smartphone-Based Automated Grading System for Ocular Redness Associated with Dry Eye Disease and Implications for Clinical Trials.

Clinical Ophthalmology (Auckland, N.Z.). Pub Date: 2025-03-13. eCollection Date: 2025-01-01. DOI: 10.2147/OPTH.S506519
John D Rodriguez, Adam Hamm, Ethan Bensinger, Samanatha J Kerti, Paul J Gomes, George W Ousler III, Palak Gupta, Carlos Gustavo De Moraes, Mark B Abelson
Volume 19, pages 907-914. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912931/pdf/

Abstract

Purpose: This study introduces a fully automated approach using deep learning-based segmentation to select the conjunctiva as the region of interest (ROI) for large-scale, multi-site clinical trials. By integrating a precise, objective grading system, we aim to minimize inter- and intra-grader variability due to perceptual biases. We evaluate the impact of adding a "horizontality" parameter to the grading system and assess this method's potential to enhance grading precision, reduce sample size, and improve clinical trial efficiency.

Methods: We analyzed 29,640 images from 450 subjects in a multi-visit, multi-site clinical trial to assess the performance of an automated grading model compared with expert graders. Images were graded on a 0-4 scale in 0.5 increments. The model uses the DeepLabV3 architecture for image segmentation and extracts two key features, horizontality and redness. The algorithm then uses these features to predict eye redness, and the predictions are validated against expert grader scores.
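
The abstract does not state the functional form of the predictive model or how the two features are computed from the segmented conjunctiva. As a minimal sketch, assuming a plain linear regression from two per-image features to the expert grade, the fitting and prediction steps might look like the following. The function names `fit_bivariate` and `predict_grade`, the linear form, and the optional snapping to 0.5 steps are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def fit_bivariate(redness, horizontality, grades):
    """Least-squares fit of grade ~ b0 + b1*redness + b2*horizontality.

    The linear form is an assumption made for illustration only.
    """
    redness = np.asarray(redness, dtype=float)
    horizontality = np.asarray(horizontality, dtype=float)
    # Design matrix: intercept column plus the two image features.
    X = np.column_stack([np.ones_like(redness), redness, horizontality])
    coef, *_ = np.linalg.lstsq(X, np.asarray(grades, dtype=float), rcond=None)
    return coef

def predict_grade(coef, redness, horizontality, snap=False):
    """Predict redness on the 0-4 scale; optionally snap to 0.5 increments."""
    raw = coef[0] + coef[1] * np.asarray(redness) + coef[2] * np.asarray(horizontality)
    raw = np.clip(raw, 0.0, 4.0)  # keep predictions inside the grading scale
    return np.round(raw * 2) / 2 if snap else raw
```

Leaving `snap=False` keeps the prediction continuous, which matches the abstract's point that the automated method provides a continuous scale. Snapping is useful only when comparing directly against grades recorded in 0.5 steps.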

Results: The bivariate model using both redness and horizontality performed best, with a mean absolute error (MAE) of 0.450 points (SD = 0.334) on the redness scale relative to expert scores. Expert-graded scores were within one unit of the mean grade in over 85% of cases, which ensured consistency and an optimal training set for the predictive model. Models incorporating both features outperformed those using only redness, reducing MAE by 5-6%. With horizontality included, the optimal generalized model predicted 93.0% of images with an absolute error of less than one grading unit.
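
The agreement statistics reported here (MAE, its SD, the share of images with an error below one unit, and the relative MAE reduction) can be computed from paired predicted and expert grades. The sketch below assumes the reported SD is the sample standard deviation of the per-image absolute errors, which the abstract does not confirm. The function names are hypothetical.

```python
import numpy as np

def grading_agreement(predicted, expert):
    """Return MAE, SD of absolute errors, and the share of errors below one unit."""
    abs_err = np.abs(np.asarray(predicted, dtype=float) - np.asarray(expert, dtype=float))
    return {
        "mae": abs_err.mean(),
        "sd": abs_err.std(ddof=1),  # sample SD; an assumption about the reported value
        "within_one_unit": (abs_err < 1.0).mean(),
    }

def relative_mae_reduction(mae_univariate, mae_bivariate):
    """Fractional MAE reduction obtained by adding the second feature."""
    return (mae_univariate - mae_bivariate) / mae_univariate
```

Under this definition, a 5-6% reduction means the bivariate MAE is roughly 0.94 to 0.95 times the redness-only MAE.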

Conclusion: This study demonstrates that fully automating image analysis allows thousands of images to be graded efficiently. The addition of the horizontality parameter enhances model performance, reduces error, and supports its relevance to specific Dry Eye manifestations. This automated method provides a continuous scale and greater sensitivity to treatment effects than standard clinical scales.
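
The abstract does not give a sample-size calculation. As a textbook illustration of why lower grading variability can reduce sample size, consider the standard normal-approximation formula for comparing two arm means. Here $\delta$ is the treatment difference to detect, $\sigma^2$ is the outcome variance, and the split of $\sigma^2$ into subject and measurement components is an assumption made for illustration.

```latex
n_{\text{per arm}} \approx \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{\delta^{2}},
\qquad
\sigma^{2} = \sigma^{2}_{\text{subject}} + \sigma^{2}_{\text{measurement}}
```

Because $n$ scales linearly with $\sigma^{2}$, any reduction in $\sigma^{2}_{\text{measurement}}$ from more consistent grading lowers the required number of subjects for a fixed $\delta$, $\alpha$, and power $1-\beta$.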
