An explainable AI for breast cancer classification using vision Transformer (ViT)

Impact factor 4.9; Zone 2 (Medicine); JCR Q1, ENGINEERING, BIOMEDICAL

Biomedical Signal Processing and Control, Volume 108, Article 108011. Pub Date: 2025-05-02. DOI: 10.1016/j.bspc.2025.108011

Marwa Naas, Hiba Mzoughi, Ines Njeh, Mohamed BenSlima


Citations: 0

Abstract

Manual classification of breast cancer (BC) under an optical microscope is an essential task in clinical routine and requires highly skilled pathologists. Computer-aided diagnosis (CAD) techniques based on deep learning (DL) have been developed to assist pathologists in making diagnostic decisions. Nevertheless, the black-box nature of these DL-based models, together with their lack of interpretability and transparency, makes them difficult to apply in sensitive and critical medical settings. Explainable artificial intelligence (XAI) strategies provide explanations for model predictions and thereby help to gain the trust of clinicians. Current Convolutional Neural Network (CNN) architectures are limited in their ability to capture the global feature information present in BC histopathological images. Vision Transformer (ViT) architectures have recently been introduced to overcome the difficulty that CNN-based models have with long-range dependencies.

These architectures use a self-attention mechanism to analyze images, which allows the network to capture long-range dependencies between distant parts of an image. The present work aims to develop an effective CAD tool for BC classification. In this study, we investigated a deep ViT architecture trained to perform binary lesion classification (malignant versus benign) on histopathology images. Several XAI techniques were implemented to highlight the features that contribute most to the model's predictions: Gradient-Weighted Class Activation Mapping (Grad-CAM), Vanilla Gradient, Integrated Gradients, Saliency Maps, Local Interpretable Model-Agnostic Explanations (LIME), and Attention Maps. The evaluation was performed on the publicly accessible benchmark dataset BreakHis. Based on the results, our proposed ViT architecture demonstrates competitive performance, surpassing state-of-the-art CNN models in the analysis of histopathological images. Furthermore, the proposed models provide precise and accurate interpretations, reinforcing their reliability. Therefore, we can affirm that the proposed CAD system can be effectively integrated into clinical diagnostic routines, offering enhanced support for medical professionals.
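As an illustration of the self-attention idea mentioned in the abstract, the following is a minimal NumPy sketch of single-head scaled dot-product attention over image patches. It is not the authors' model: the image is random, the weights are random rather than trained, and the names `patchify`, `self_attention`, `w_embed`, `w_q`, `w_k`, and `w_v` are hypothetical. A real ViT also adds a class token, positional embeddings, multiple heads, and stacked layers, all of which are omitted here.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    x = image[:rows * patch, :cols * patch]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    x = x.reshape(rows, patch, cols, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(rows * cols, patch * patch * c)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    Returns the attended tokens and the (N, N) attention weights, where
    row i says how strongly patch i attends to every other patch.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))      # stand-in for a histopathology tile
tokens = patchify(image, patch=8)    # 16 patches, each 8*8*3 = 192 values

d_model = 16                         # illustrative embedding size
w_embed = rng.normal(scale=0.1, size=(tokens.shape[1], d_model))
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))

embedded = tokens @ w_embed
out, attn = self_attention(embedded, w_q, w_k, w_v)

# Attention of the first patch over the 4x4 patch grid: a crude "attention map".
attn_map = attn[0].reshape(4, 4)
```

Because every patch attends to every other patch in a single step, a dependency between two distant regions does not have to pass through many stacked local filters, which is the contrast with CNNs that the abstract draws. Reshaping one row of the attention matrix back onto the patch grid, as in `attn_map`, is the basic idea behind attention-map visualizations.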


Source journal

Biomedical Signal Processing and Control (Engineering & Technology: Biomedical Engineering)

CiteScore: 9.80

Self-citation rate: 13.70%

Articles published: 822

Review time: 4 months

Journal description: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.