A Comparative Study of Explainability Methods for Whole Slide Classification of Lymph Node Metastases using Vision Transformers

Jens Rahnfeld, Mehdi Naouar, Gabriel Kalweit, Joschka Boedecker, Estelle Dubruc, Maria Kalweit
{"title":"A Comparative Study of Explainability Methods for Whole Slide Classification of Lymph Node Metastases using Vision Transformers","authors":"Jens Rahnfeld, Mehdi Naouar, Gabriel Kalweit, Joschka Boedecker, Estelle Dubruc, Maria Kalweit","doi":"10.1101/2024.05.07.24306815","DOIUrl":null,"url":null,"abstract":"Recent advancements in deep learning (DL), such as transformer networks, have shown promise in enhancing the performance of medical image analysis. In pathology, automated whole slide imaging (WSI) has transformed clinical workflows by streamlining routine tasks and diagnostic and prognostic support. However, the lack of transparency of DL models, often described as “black boxes”, poses a significant barrier to their clinical adoption. This necessitates the use of explainable AI methods (xAI) to clarify the decision-making processes of the models. Heatmaps can provide clinicians visual representations that highlight areas of interest or concern for the prediction of the specific model. Generating them from deep neural networks, especially from vision transformers, is non-trivial, as typically their self-attention mechanisms can lead to overconfident artifacts. The aim of this work is to evaluate current xAI methods for transformer models in order to assess which yields the best heatmaps in the histopathological context. Our study undertakes a comparative analysis for classifying a publicly available dataset comprising of N=400 WSIs of lymph node metastases of breast cancer patients. Our findings indicate that heatmaps calculated from Attention Rollout and Integrated Gradients are limited by artifacts and in quantitative performance. In contrast, removal-based methods like RISE and ViT-Shapley exhibit better qualitative attribution maps, showing better results in the well-known interpretability metrics for insertion and deletion. In addition, ViT-Shapley shows faster runtime and the most promising, reliable and practical heatmaps. Incorporating the heatmaps generated from approximate Shapley values in pathology reports could help to integrate xAI in the clinical workflow and increase trust in a scalable manner.","PeriodicalId":501528,"journal":{"name":"medRxiv - Pathology","volume":"44 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Pathology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.05.07.24306815","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Recent advancements in deep learning (DL), such as transformer networks, have shown promise in enhancing the performance of medical image analysis. In pathology, automated whole slide imaging (WSI) has transformed clinical workflows by streamlining routine tasks and supporting diagnosis and prognosis. However, the lack of transparency of DL models, often described as “black boxes”, poses a significant barrier to their clinical adoption. This necessitates the use of explainable AI (xAI) methods to clarify the models' decision-making processes. Heatmaps can provide clinicians with visual representations that highlight the areas of interest or concern behind a specific model's prediction. Generating them from deep neural networks, especially from vision transformers, is non-trivial, as their self-attention mechanisms can lead to overconfident artifacts. The aim of this work is to evaluate current xAI methods for transformer models and assess which yields the best heatmaps in the histopathological context. Our study undertakes a comparative analysis on a publicly available dataset comprising N=400 WSIs of lymph node metastases from breast cancer patients. Our findings indicate that heatmaps calculated from Attention Rollout and Integrated Gradients are limited both by artifacts and in quantitative performance. In contrast, removal-based methods such as RISE and ViT-Shapley produce better qualitative attribution maps and score better on the well-known insertion and deletion interpretability metrics. In addition, ViT-Shapley runs faster and yields the most promising, reliable, and practical heatmaps. Incorporating heatmaps generated from approximate Shapley values into pathology reports could help integrate xAI into the clinical workflow and increase trust in a scalable manner.
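For context, Attention Rollout aggregates a vision transformer's per-layer attention maps into a single token-level relevance map by multiplying head-averaged attention matrices across layers, with an identity term accounting for residual connections. The following is a minimal NumPy sketch of the standard formulation (Abnar & Zuidema, 2020), not the implementation evaluated in this paper; it assumes the per-layer attention tensors have already been extracted from the model.

    import numpy as np

    def attention_rollout(attentions):
        # Attention Rollout (Abnar & Zuidema, 2020).
        # attentions: list of per-layer attention tensors, each of shape
        # (num_heads, num_tokens, num_tokens), as obtained e.g. by running
        # a ViT with attention outputs enabled.
        num_tokens = attentions[0].shape[-1]
        rollout = np.eye(num_tokens)
        for attn in attentions:
            attn = attn.mean(axis=0)                        # average over heads
            attn = 0.5 * attn + 0.5 * np.eye(num_tokens)    # residual connection
            attn = attn / attn.sum(axis=-1, keepdims=True)  # re-normalize rows
            rollout = attn @ rollout                        # propagate one layer
        # Row 0 holds the relevance of every token for the [CLS] token;
        # reshaping the patch entries to the patch grid yields the heatmap.
        return rollout[0, 1:]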
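The insertion and deletion metrics mentioned in the abstract quantify heatmap faithfulness by perturbing pixels in order of attributed importance: deletion removes the most salient pixels first and rewards a fast drop in the class probability (low AUC), while insertion reveals them on a blank baseline and rewards a fast rise (high AUC). Below is a hedged sketch of the deletion variant following Petsiuk et al. (2018); the callable `model`, which maps a single image to the target-class probability, is a hypothetical stand-in not defined in the original text.

    import numpy as np

    def deletion_auc(model, image, saliency, frac_per_step=0.01, baseline=0.0):
        # Deletion metric: erase the most salient pixels first and integrate
        # the model's class probability as pixels are removed.
        # Lower AUC indicates a more faithful heatmap.
        # image: (H, W, C) array; saliency: (H, W) attribution map.
        h, w, c = image.shape
        order = np.argsort(saliency.ravel())[::-1]       # most salient first
        step = max(1, int(frac_per_step * h * w))
        img = image.copy()
        flat = img.reshape(-1, c)                        # view into img
        probs = [model(img)]
        for start in range(0, h * w, step):
            flat[order[start:start + step]] = baseline   # erase next batch
            probs.append(model(img))
        xs = np.linspace(0.0, 1.0, num=len(probs))
        return float(np.trapz(probs, xs))                # area under the curve

The insertion variant is symmetric: start from an all-baseline image and restore pixels in the same saliency order, tracking the probability as content is revealed.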