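Heatmap analysis for artificial intelligence explainability in diabetic retinopathy detection: illuminating the rationale of deep learning decisions.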

Zone 4, Medicine
Annals of Translational Medicine. Pub Date: 2024-10-20. Epub Date: 2024-10-12. DOI: 10.21037/atm-24-73
Fernando Korn Malerbi, Luis Filipe Nakayama, Paulo Prado, Fernando Yamanaka, Gustavo Barreto Melo, Caio Vinicius Regatieri, José Augusto Stuchi
Volume 12, issue 5, page 89. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534741/pdf/
Citations: 0

Abstract

Heatmap analysis for artificial intelligence explainability in diabetic retinopathy detection: illuminating the rationale of deep learning decisions.

Background: The opacity of the decision processes of artificial intelligence (AI) algorithms limits their application in healthcare. Our objective was to explore discrepancies between heatmaps generated from slightly different retinal images of the same eyes of individuals with diabetes, to gain insight into the deep learning (DL) decision process.

Methods: Pairs of retinal images from the same eyes of individuals with diabetes, each pair comprising images obtained before and after pupil dilation, were automatically analyzed by a convolutional neural network for the presence of diabetic retinopathy (DR); the output was a score ranging from 0 to 1. Gradient-weighted Class Activation Mapping (Grad-CAM) allowed visualization of the activated areas. Pairs with discordant DL scores or outputs within the pair were objectively compared with concordant pairs on three measures: the sum of activations in the class activation map (CAM), the number of activated areas, and the DL score difference. Heatmaps of discordant pairs were also qualitatively assessed.
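
The three objective measures named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how an "activated area" is delimited or which cut-offs define discordance, so `activation_threshold`, `decision_threshold`, and `max_score_gap` are hypothetical parameters, and counting 4-connected regions above a threshold is only one plausible reading of "number of activated areas".

```python
from collections import deque

import numpy as np


def heatmap_metrics(cam, activation_threshold=0.5):
    """Return (sum of activations, number of activated areas) for one heatmap.

    `cam` is a 2-D class activation map scaled to [0, 1].
    `activation_threshold` is a hypothetical cut-off, not a value from the paper.
    """
    cam = np.asarray(cam, dtype=float)
    mask = cam >= activation_threshold
    visited = np.zeros(mask.shape, dtype=bool)
    rows, cols = mask.shape
    n_areas = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or visited[r, c]:
                continue
            # New 4-connected region found: flood-fill it with a BFS.
            n_areas += 1
            visited[r, c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not visited[ny, nx]):
                        visited[ny, nx] = True
                        queue.append((ny, nx))
    # Assumption: the "sum of activations" is taken over the whole map.
    return float(cam.sum()), n_areas


def pair_discordance(score_a, score_b, decision_threshold=0.5, max_score_gap=0.2):
    """Compare the two DL scores of one eye (before vs. after dilation).

    Both thresholds are hypothetical; the abstract does not report them.
    """
    gap = abs(score_a - score_b)
    output_discordant = (score_a >= decision_threshold) != (score_b >= decision_threshold)
    return {
        "score_gap": gap,
        "score_discordant": gap > max_score_gap,
        "output_discordant": output_discordant,
    }
```

Under these assumptions, a pair scoring 0.45 and 0.55 would be discordant by output (the two images fall on opposite sides of the decision threshold) but not by score difference, which shows why the two discordance criteria can select different eyes.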

Results: The algorithm detected DR with 89.8% sensitivity, 96.3% specificity, and an area under the receiver operating characteristic (ROC) curve of 0.95. Of 210 comparable pairs of images, 20 eyes were considered discordant according to DL score difference and 10 eyes according to DL output. Comparison of the concordant and discordant groups showed statistically significant differences for all objective variables. Qualitative analysis pointed to subtle differences in image quality within discordant pairs.
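
For readers less familiar with the performance figures quoted in the Results, the sketch below shows how sensitivity, specificity, and the area under the ROC curve are conventionally computed from reference labels and scores in the 0 to 1 range. The function names and the `decision_threshold` default are illustrative assumptions, and nothing here reproduces the study's data or its evaluation code.

```python
def sensitivity_specificity(labels, scores, decision_threshold=0.5):
    """Sensitivity and specificity at a given score cut-off.

    `labels` are reference grades (1 = DR present, 0 = absent).
    `decision_threshold` is a hypothetical cut-off, not the study's value.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= decision_threshold
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def roc_auc(labels, scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for l, s in zip(labels, scores) if l]
    negatives = [s for l, s in zip(labels, scores) if not l]
    if not positives or not negatives:
        return float("nan")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only; these are not values from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.6, 0.4]
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Sensitivity and specificity depend on the chosen cut-off, whereas the AUC summarizes ranking quality across all cut-offs, which is why the abstract reports all three.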

Conclusions: The relationship established between objective parameters extracted from heatmaps and DL output discrepancies reinforces the role of heatmaps in DL explainability, fostering acceptance of DL systems for clinical use.

Source journal
Self-citation rate: 0.00%
Articles published: 769
About the journal: The Annals of Translational Medicine (Ann Transl Med; ATM; Print ISSN 2305-5839; Online ISSN 2305-5847) is an international, peer-reviewed Open Access journal featuring original and observational investigations in the broad fields of laboratory, clinical, and public health research. It aims to provide practical, up-to-date information on significant research from all subspecialties of medicine and to broaden readers’ vision and horizon from bench to bed and bed to bench. It was published quarterly (April 2013 to Dec. 2013), then monthly (Jan. 2014 to Feb. 2015), and has been published biweekly since March 2015; it is openly distributed worldwide. Annals of Translational Medicine was indexed in PubMed in Sept 2014 and in SCIE in 2018. Specific areas of interest include, but are not limited to, multimodality therapy, epidemiology, biomarkers, imaging, biology, pathology, and technical advances related to medicine. Submissions describing preclinical research with potential for application to human disease, and studies describing research obtained from preliminary human experimentation with potential to further the understanding of the biological mechanisms underlying disease, are encouraged. Also warmly welcomed are studies describing public health research pertinent to clinical practice, disease diagnosis and prevention, or healthcare policy. With a focus on interdisciplinary academic cooperation, ATM aims to expedite the translation of scientific discovery into new or improved standards of management and health outcomes practice.