Enhancing histopathological image analysis: An explainable vision transformer approach with comprehensive interpretation methods and evaluation of explanation quality
Aqib Nazir Mir, Danish Raza Rizvi, Md Rizwan Ahmad
{"title":"Enhancing histopathological image analysis: An explainable vision transformer approach with comprehensive interpretation methods and evaluation of explanation quality","authors":"Aqib Nazir Mir , Danish Raza Rizvi , Md Rizwan Ahmad","doi":"10.1016/j.engappai.2025.110519","DOIUrl":null,"url":null,"abstract":"<div><div>Deep learning models are increasingly reshaping medical imaging, with growing attention on ensuring transparency and trust in their decision-making processes. This study presents the Explainable Vision Transformer (XViT), a model specifically designed for histopathological image analysis. By incorporating advanced interpretability techniques, the XViT model addresses three core aspects: feature learning and classification, generating explainable outputs, and qualitatively evaluating these explanations. Three novel interpretability methods are introduced: attention-based, model-agnostic, and gradient-based, offering diverse perspectives on model behavior. The model's performance and generalizability were rigorously evaluated on two histopathological datasets: lung colon 25000 (LCS25000) with 96.2% accuracy across three classes and Kangbuk Samsung Hospital (KBSMC) with 88.6% accuracy across four classes. XViT provides actionable insights by highlighting diagnostically relevant regions in input images, significantly enhancing clinical trust and decision-making. The evaluation of its explainability methods through metrics like sensitivity, faithfulness, and complexity demonstrated that layer-wise relevance propagation for transformers outperforms standard techniques like local interpretable model-agnostic explanations (LIME) and attention visualization. This robust performance underscores the XViT model's potential to bridge the gap between AI accuracy and interpretability in medical imaging. Our findings emphasize the need for well-defined evaluation criteria when comparing interpretability methods and highlight the model's potential for integration into clinical workflows. This work represents a step forward in creating reliable and interpretable AI solutions, ensuring that the benefits of advanced deep learning models extend seamlessly into practical healthcare settings.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"149 ","pages":"Article 110519"},"PeriodicalIF":8.0000,"publicationDate":"2025-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197625005196","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Abstract
Deep learning models are increasingly reshaping medical imaging, with growing attention on ensuring transparency and trust in their decision-making processes. This study presents the Explainable Vision Transformer (XViT), a model specifically designed for histopathological image analysis. By incorporating advanced interpretability techniques, the XViT model addresses three core aspects: feature learning and classification, generating explainable outputs, and qualitatively evaluating these explanations. Three novel interpretability methods are introduced: attention-based, model-agnostic, and gradient-based, offering diverse perspectives on model behavior. The model's performance and generalizability were rigorously evaluated on two histopathological datasets: lung colon 25000 (LCS25000) with 96.2% accuracy across three classes and Kangbuk Samsung Hospital (KBSMC) with 88.6% accuracy across four classes. XViT provides actionable insights by highlighting diagnostically relevant regions in input images, significantly enhancing clinical trust and decision-making. The evaluation of its explainability methods through metrics like sensitivity, faithfulness, and complexity demonstrated that layer-wise relevance propagation for transformers outperforms standard techniques like local interpretable model-agnostic explanations (LIME) and attention visualization. This robust performance underscores the XViT model's potential to bridge the gap between AI accuracy and interpretability in medical imaging. Our findings emphasize the need for well-defined evaluation criteria when comparing interpretability methods and highlight the model's potential for integration into clinical workflows. This work represents a step forward in creating reliable and interpretable AI solutions, ensuring that the benefits of advanced deep learning models extend seamlessly into practical healthcare settings.
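To make the attention-based family of explanations mentioned in the abstract concrete, the sketch below implements attention rollout (Abnar & Zuidema, 2020), a generic way to aggregate per-layer Vision Transformer attention maps into a single patch-level relevance map. This is an illustrative sketch only, not the paper's XViT implementation; the layer count, head averaging, and the 0.5 residual weighting are common conventions assumed here for demonstration.

```python
# Illustrative sketch: attention rollout for a ViT-style model.
# NOT the paper's XViT code; shapes and the residual weight are assumptions.
import numpy as np

def attention_rollout(attentions, residual_weight=0.5):
    """Combine per-layer attention maps into one relevance map.

    attentions: list of arrays, each of shape (num_heads, tokens, tokens),
                ordered from the first to the last transformer block.
    Returns an array of shape (tokens,): the CLS token's relevance to every
    token (index 0 is the CLS token itself).
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for layer_attn in attentions:
        attn = layer_attn.mean(axis=0)                      # average over heads
        attn = residual_weight * attn + (1 - residual_weight) * np.eye(num_tokens)
        attn = attn / attn.sum(axis=-1, keepdims=True)      # re-normalise rows
        rollout = attn @ rollout                            # propagate through the layer
    return rollout[0]                                       # relevance as seen from CLS

# Toy usage with random attention maps for a 12-layer, 12-head ViT on a
# 14x14 patch grid (196 patches + 1 CLS token).
rng = np.random.default_rng(0)
attns = [rng.dirichlet(np.ones(197), size=(12, 197)) for _ in range(12)]
cls_relevance = attention_rollout(attns)
patch_saliency = cls_relevance[1:].reshape(14, 14)          # drop CLS, map to patch grid
print(patch_saliency.shape)  # (14, 14) heat map over image patches
```

In practice such a map would be upsampled to the input resolution and overlaid on the histopathological slide to highlight diagnostically relevant regions.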
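The abstract also reports scoring explanations with metrics such as faithfulness. A common, simple instance of this idea is a deletion-style check: progressively mask the patches an explanation ranks as most relevant and measure how quickly the model's confidence in the target class drops. The sketch below is a minimal, hedged illustration of that idea with a dummy model; it is not the paper's own faithfulness, sensitivity, or complexity metric, and the masking value and step size are assumptions.

```python
# Illustrative sketch: deletion-based faithfulness check for a saliency map.
# The dummy model and masking scheme are assumptions for demonstration only.
import numpy as np

def deletion_faithfulness(predict_fn, image, saliency, target_class, steps=10):
    """Zero out the most salient patches first and track the drop in the
    target-class probability; a faster drop suggests a more faithful map.

    predict_fn: callable mapping an (H, W, C) image to class probabilities.
    saliency:   array of shape (gh, gw) of patch-level relevance scores.
    """
    gh, gw = saliency.shape
    ph, pw = image.shape[0] // gh, image.shape[1] // gw      # patch size in pixels
    order = np.argsort(saliency.ravel())[::-1]               # most salient first
    masked = image.copy()
    probs = [predict_fn(masked)[target_class]]
    per_step = max(1, len(order) // steps)
    for start in range(0, len(order), per_step):
        for idx in order[start:start + per_step]:
            r, c = divmod(int(idx), gw)
            masked[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw, :] = 0.0
        probs.append(predict_fn(masked)[target_class])
    return np.array(probs)    # the area under this curve summarises faithfulness

# Toy usage with a dummy two-class "model" that prefers bright images.
def dummy_predict(img):
    score = img.mean()
    return np.array([1.0 - score, score])

img = np.random.default_rng(1).random((224, 224, 3))
sal = np.random.default_rng(2).random((14, 14))
curve = deletion_faithfulness(dummy_predict, img, sal, target_class=1)
print(curve)  # probabilities as increasingly salient patches are removed
```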
Journal Introduction
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.