DR-TrustNet: Enhancing diabetic retinopathy detection using reliable efficient networks and uncertainty quantification

IF 4.2 · Zone 3 Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Image and Vision Computing · Pub Date: 2026-04-01 · Epub Date: 2026-01-31 · DOI: 10.1016/j.imavis.2026.105921

Preeti Verma, Sivasankar Elango, Kunwar Singh

Image and Vision Computing, Volume 168, Article 105921 (Journal Article). https://www.sciencedirect.com/science/article/pii/S0262885626000272

Citations: 0

Abstract

Diabetic retinopathy (DR) is one of the main causes of vision loss, and early detection is key to preventing permanent damage. Screening currently relies on manual examination by doctors, which is time-consuming and not always consistent. Deep neural networks (DNNs) are a major step toward high-precision DR detection, but concerns remain: these models can be over-confident in their predictions, which leads to mistakes that matter most in critical health care settings. In addition, current deep learning methods do not handle uncertainty well, which makes them difficult to trust in real medical environments. To address these challenges, we develop a system with three components. First, we improve the quality of retinal images using the Adaptive Fundus Enhancement Pipeline (AFEP). Second, we extract more useful features from the images using a modified version of EfficientNet-B0. Finally, we calibrate the model's predictions so that its level of confidence reflects how often it is actually correct; this step uses test-time data augmentation and temperature scaling to reduce the chance of an incorrect diagnosis. Results on the IDRiD dataset are promising: the model achieves 96% accuracy and much better uncertainty calibration, with an expected calibration error (ECE) of only 0.030. In other words, it is not only accurate but also more reliable in the real world. Overall, our methodology can make AI-based DR screening more practical and reliable for both doctors and patients.
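The calibration step and the reported metric can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract names temperature scaling, test-time augmentation, and expected calibration error but gives no details, so the function names, the 10 equal-width confidence bins, the grid search over the temperature, and the choice to average softmax outputs across augmented views are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature (T > 1 softens)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE: bin-weighted mean of |accuracy - confidence| over confidence bins.

    The number of bins is an assumption; the abstract does not state it.
    """
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of samples in bin
    return float(ece)


def negative_log_likelihood(logits, labels, temperature=1.0):
    """Mean cross-entropy of the temperature-scaled predictions."""
    p = softmax(logits, temperature)
    return float(-np.log(p[np.arange(len(labels)), labels] + 1e-12).mean())


def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the temperature that minimizes NLL on held-out validation data.

    A simple grid search stands in for the gradient-based optimizers that
    are often used for this one-parameter fit (assumption).
    """
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)
    losses = [negative_log_likelihood(val_logits, val_labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])


def tta_probs(logits_per_view, temperature=1.0):
    """Average calibrated probabilities over augmented views of each image.

    logits_per_view has shape (n_views, n_samples, n_classes); how the views
    are generated and combined in the paper is not stated (assumption).
    """
    return np.mean([softmax(v, temperature) for v in logits_per_view], axis=0)
```

In use, `fit_temperature` would be run once on validation logits, and the resulting temperature passed to `tta_probs` at inference; `expected_calibration_error` can then be compared before and after scaling. Because dividing logits by a positive temperature does not change the arg-max class, this kind of calibration leaves single-view accuracy unchanged and only adjusts confidence.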


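Source journal

Image and Vision Computing (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 8.50

Self-citation rate: 8.50%

Articles published: 143

Review time: 7.8 months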

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.