Analyzing explainability of YOLO-based breast cancer detection using heat map visualizations.

IF 2.3 | CAS Rank: Region 2 (Medicine) | JCR Q2, Radiology, Nuclear Medicine & Medical Imaging
Awika Ariyametkul, May Phu Paing
Journal: Quantitative Imaging in Medicine and Surgery, Vol. 15, No. 7, pp. 6252-6271
DOI: 10.21037/qims-2024-2911
Published: 2025-07-01 (Epub 2025-06-30)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12290753/pdf/
Citations: 0

Abstract

Background: Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer-related mortality among women worldwide. The disease is particularly dangerous because it is often asymptomatic in its early stages, which underscores the importance of early detection. Mammography, a specialized X-ray imaging technique for breast examination, has been pivotal in facilitating early detection and reducing mortality rates. In recent years, artificial intelligence (AI) has gained substantial popularity across various fields, including medicine. Numerous studies have leveraged AI techniques, particularly convolutional neural networks (CNNs) and You Only Look Once (YOLO)-based models, for medical image detection and classification. However, the predictions of such AI models often lack transparency and explainability, resulting in low trustworthiness. This study addresses this gap by investigating three state-of-the-art versions of the YOLO algorithm, namely YOLO version 9 (YOLOv9), YOLO version 10 (YOLOv10), and YOLO version 11 (YOLO11), trained on breast cancer imaging datasets, specifically the INbreast and Mammographic Image Analysis Society (MIAS) databases. Additionally, to address the challenges posed by the lack of explainability and transparency, we integrate seven explainable artificial intelligence (XAI) methods: Grad-CAM, Grad-CAM++, Eigen-CAM, EigenGrad-CAM, XGrad-CAM, LayerCAM, and HiResCAM.
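The seven CAM-family methods listed above all reduce to combining a convolutional layer's activation maps with gradient information from the target score. As a minimal NumPy sketch (not the authors' code, and the activation/gradient tensors here are stand-ins rather than outputs of the paper's YOLO models), the two ends of that family can be contrasted: Grad-CAM pools gradients into one weight per channel, while HiResCAM keeps the gradient's spatial structure via an element-wise product.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM: per-channel weights = spatially averaged gradients.

    activations, gradients: arrays of shape (K, H, W) from one conv layer,
    gradients taken w.r.t. the target class score. Returns an (H, W) map
    normalized to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))                        # shape (K,)
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    return cam / cam.max() if cam.max() > 0 else cam

def hires_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """HiResCAM: element-wise gradient-activation product before the channel
    sum, so spatial gradient detail is preserved instead of averaged away."""
    cam = np.maximum((gradients * activations).sum(axis=0), 0.0)
    return cam / cam.max() if cam.max() > 0 else cam
```

On a real detector, one would hook the chosen layer to capture forward activations and backward gradients; libraries such as pytorch-grad-cam package exactly this plumbing for all seven methods named above.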

Methods: This study utilized two publicly available breast cancer image databases: INbreast: toward a Full-field Digital Mammographic Database and the MIAS dataset. Preprocessing steps were applied to standardize all images in accordance with the input requirements of the YOLO architecture, as these datasets were used to train the three most recent versions of YOLO. The YOLO model demonstrating the highest performance, as measured by mean average precision (mAP), precision, and recall, was selected for integration with seven different XAI methods. The performance of each XAI technique was evaluated both qualitatively through visual inspection and quantitatively using several metrics, including matching ground truth (mGT), Pearson correlation coefficient (PCC), precision, recall, and root mean square error (RMSE). These methodologies were employed to interpret and visualize the "black box" decision-making processes of the top-performing YOLO model.
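Of the quantitative metrics above, PCC and RMSE are standard comparisons between a normalized heat map and a binary ground-truth region; mGT is a paper-specific score and is not reproduced here. A small illustrative sketch of the two standard ones, under the assumption that the heat map is already scaled to [0, 1]:

```python
import numpy as np

def heatmap_vs_mask(heatmap: np.ndarray, mask: np.ndarray):
    """Compare a heat map in [0, 1] against a binary ground-truth mask.

    Returns (PCC, RMSE): Pearson correlation of the flattened arrays,
    and the root mean square error between them.
    """
    h = heatmap.ravel().astype(float)
    m = mask.ravel().astype(float)
    pcc = float(np.corrcoef(h, m)[0, 1])    # undefined if either is constant
    rmse = float(np.sqrt(np.mean((h - m) ** 2)))
    return pcc, rmse
```

A heat map that exactly matches the lesion mask scores PCC = 1 and RMSE = 0; a map that highlights only background scores PCC = -1.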

Results: Based on our experimental findings, YOLO11 outperformed YOLOv9 (mAP 0.868) and YOLOv10 (mAP 0.926), achieving the highest mAP of 0.935, with classification accuracies of 95% for benign and 80% for malignant cases. Among the evaluated XAI techniques, HiResCAM provided the most effective visual explanations, attaining the highest mGT score of 0.49, surpassing EigenGrad-CAM (0.45) and LayerCAM (0.42) in both visual and quantitative evaluations.
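mAP, the metric used above to rank the three YOLO versions, is the mean over classes of the area under each class's precision-recall curve. A generic sketch of the per-class AP step with all-point interpolation (one common convention, not necessarily the authors' exact evaluation code):

```python
import numpy as np

def average_precision(recalls, precisions) -> float:
    """Area under a precision-recall curve, all-point interpolation.

    recalls: recall values sorted ascending; precisions: matching values.
    """
    r = np.concatenate(([0.0], np.asarray(recalls, float), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precisions, float), [0.0]))
    # Make precision monotonically non-increasing, sweeping right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum precision over each recall step.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```

mAP then averages this value across classes (here, benign and malignant), optionally also across IoU thresholds depending on the evaluation protocol.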

Conclusions: The integration of YOLO11 with HiResCAM offers a robust solution that combines high detection accuracy with improved model interpretability. This approach not only enhances user trust by revealing decision-making patterns and limitations but also provides insights into the model's weaknesses, enabling developers to refine and further improve AI performance.

Source journal: Quantitative Imaging in Medicine and Surgery (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 4.20 | Self-citation rate: 17.90% | Articles per year: 252