Marcello Di Giammarco, Antonella Santone, Mario Cesarelli, Fabio Martinelli, Francesco Mercaldo
Explainable retinal disease classification and localization through Convolutional Neural Networks
Image and Vision Computing, volume 162, Article 105667. Published 2025-07-19. Impact factor: 4.2 (JCR Q2, Computer Science, Artificial Intelligence).
DOI: 10.1016/j.imavis.2025.105667
https://www.sciencedirect.com/science/article/pii/S0262885625002550
Citations: 0
Abstract
Retinal diseases pose significant challenges to vision globally, affecting a substantial portion of the population. The reliance on expert clinicians to interpret Optical Coherence Tomography (OCT) images underscores the need for an automated diagnostic process. In this paper, we propose a method for automatically detecting and localizing retinal disease with deep learning convolutional neural networks applied to OCT images. Specifically, we design a novel deep learning model, FCNNplus, for retinal disease classification, which reaches 93.3% accuracy. The focus is not only on achieving a satisfactory diagnosis but also on the role of Class Activation Mapping (CAM) algorithms in localizing disease-specific patterns, so that the method accounts for the explainability and reliability behind each prediction. FCNNplus produces precise heatmap localizations that correctly identify the presence of retinal disease in the images. We also consider a similarity index to strengthen the qualitative analysis and provide a measure of the visual explanations given by the heatmaps (i.e., the areas of the image under analysis that, from the model's point of view, are symptomatic of a certain prediction).
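To make the heatmap and similarity ideas concrete, here is a minimal, dependency-free sketch. The abstract does not say which CAM variant or which similarity index the authors use, so everything below is an assumption made for illustration: the heatmap is a weighted sum of feature maps followed by a ReLU (the ReLU step is borrowed from Grad-CAM), cosine similarity stands in for the unspecified index, and the function names, toy feature maps, and weights are hypothetical.

```python
from math import sqrt


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W), rescaled to [0, 1].

    Illustrative only: not the FCNNplus pipeline described in the paper.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]
    # Assumption: keep only positive evidence for the class (Grad-CAM-style ReLU).
    heatmap = [[max(v, 0.0) for v in row] for row in heatmap]
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


def cosine_similarity(map_a, map_b):
    """Cosine similarity of two equally sized heatmaps.

    Assumption: a stand-in for the similarity index mentioned in the abstract.
    """
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = sqrt(sum(a * a for a in flat_a))
    norm_b = sqrt(sum(b * b for b in flat_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 2 x 2 feature maps from a hypothetical last convolutional layer.
toy_feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
heatmap_a = class_activation_map(toy_feature_maps, [0.5, 1.0])
heatmap_b = class_activation_map(toy_feature_maps, [1.0, 1.0])
similarity = cosine_similarity(heatmap_a, heatmap_b)
print(heatmap_a, heatmap_b, round(similarity, 3))
```

In a real setting the feature maps and weights would come from a trained network, and the two heatmaps being compared might come from different CAM algorithms applied to the same OCT image. A similarity close to 1 would then suggest that the two explanations highlight the same regions.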
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.