Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served patient populations

AMIA Joint Summits on Translational Science proceedings. Pub Date: 2023-01-26. DOI: 10.48550/arXiv.2301.11315

Mingquan Lin, Yuyun Xiao, Bojian Hou, Tingyi Wanyan, M. Sharma, Zhangyang Wang, Fei Wang, S. V. Tassel, Yifan Peng


Citations: 1

Abstract

In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals. Deep learning has been widely used to detect POAG from fundus images, as its performance is comparable to, or even surpasses, diagnosis by clinicians. However, human bias in clinical diagnosis may be reflected and amplified in widely used deep learning models, thus affecting their performance. Biases may cause (1) underdiagnosis, which increases the risk of delayed or inadequate treatment, and (2) overdiagnosis, which may increase individuals' stress and fear, harm their well-being, and lead to unnecessary or costly treatment. In this study, we examined underdiagnosis and overdiagnosis when applying deep learning to POAG detection, based on the Ocular Hypertension Treatment Study (OHTS), which was conducted at 22 centers across 16 states in the United States. Our results show that a widely used deep learning model can underdiagnose or overdiagnose under-served populations. The most underdiagnosed group is younger (< 60 yrs) female patients, and the most overdiagnosed group is older (≥ 60 yrs) Black patients. Biased diagnosis by traditional deep learning methods may delay disease detection and treatment and create burdens among under-served populations, thereby raising ethical concerns about using deep learning models in ophthalmology clinics.
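The abstract does not state how underdiagnosis and overdiagnosis are measured. As a minimal sketch, assuming underdiagnosis is taken as the false negative rate and overdiagnosis as the false positive rate within each demographic subgroup (an assumption, not a definition given above), a per-subgroup comparison could look like the following. The function name `subgroup_rates`, the group labels, and the toy records are hypothetical and are not OHTS data.

```python
from collections import defaultdict


def subgroup_rates(records):
    """Compute per-subgroup false negative and false positive rates.

    records: iterable of (group, y_true, y_pred), where 1 means POAG
    and 0 means no POAG. Assumption: FNR stands in for underdiagnosis
    and FPR for overdiagnosis; the abstract does not define either.
    """
    counts = defaultdict(lambda: {"fn": 0, "pos": 0, "fp": 0, "neg": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        if y_true == 1:
            c["pos"] += 1
            if y_pred == 0:
                c["fn"] += 1  # diseased case the model missed
        else:
            c["neg"] += 1
            if y_pred == 1:
                c["fp"] += 1  # healthy case the model flagged

    rates = {}
    for group, c in counts.items():
        # A rate is None when the subgroup has no cases of that class.
        fnr = c["fn"] / c["pos"] if c["pos"] else None
        fpr = c["fp"] / c["neg"] if c["neg"] else None
        rates[group] = {"underdiagnosis_fnr": fnr, "overdiagnosis_fpr": fpr}
    return rates


# Hypothetical toy records, for illustration only.
toy = [
    ("female_under_60", 1, 0),
    ("female_under_60", 1, 1),
    ("female_under_60", 0, 0),
    ("black_60_plus", 0, 1),
    ("black_60_plus", 0, 0),
    ("black_60_plus", 1, 1),
]
rates = subgroup_rates(toy)
for group, r in rates.items():
    print(group, r)
```

Comparing these rates across intersectional groups (for example, sex by age band or race by age band) is one way to surface the kind of disparity the abstract reports. Rates from small subgroups are noisy, so any real analysis would also need confidence intervals.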
