Enhanced tuberculosis detection using Vision Transformers and explainable AI with a Grad-CAM approach on chest X-rays.

IF 2.9 3区 医学 Q2 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
K Vanitha, T R Mahesh, V Vinoth Kumar, Suresh Guluwadi
{"title":"Enhanced tuberculosis detection using Vision Transformers and explainable AI with a Grad-CAM approach on chest X-rays.","authors":"K Vanitha, T R Mahesh, V Vinoth Kumar, Suresh Guluwadi","doi":"10.1186/s12880-025-01630-3","DOIUrl":null,"url":null,"abstract":"<p><p>Tuberculosis (TB), caused by Mycobacterium tuberculosis, remains a leading global health challenge, especially in low-resource settings. Accurate diagnosis from chest X-rays is critical yet challenging due to subtle manifestations of TB, particularly in its early stages. Traditional computational methods, primarily using basic convolutional neural networks (CNNs), often require extensive pre-processing and struggle with generalizability across diverse clinical environments. This study introduces a novel Vision Transformer (ViT) model augmented with Gradient-weighted Class Activation Mapping (Grad-CAM) to enhance both diagnostic accuracy and interpretability. The ViT model utilizes self-attention mechanisms to extract long-range dependencies and complex patterns directly from the raw pixel information, whereas Grad-CAM offers visual explanations of model decisions about highlighting significant regions in the X-rays. The model contains a Conv2D stem for initial feature extraction, followed by many transformer encoder blocks, thereby significantly boosting its ability to learn discriminative features without any pre-processing. Performance testing on a validation set had an accuracy of 0.97, recall of 0.99, and F1-score of 0.98 for TB patients. On the test set, the model has accuracy of 0.98, recall of 0.97, and F1-score of 0.98, which is better than existing methods. The addition of Grad-CAM visuals not only improves the transparency of the model but also assists radiologists in assessing and verifying AI-driven diagnoses. These results demonstrate the model's higher diagnostic precision and potential for clinical application in real-world settings, providing a massive improvement in the automated detection of TB.</p>","PeriodicalId":9020,"journal":{"name":"BMC Medical Imaging","volume":"25 1","pages":"96"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934573/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Imaging","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12880-025-01630-3","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

Abstract

Tuberculosis (TB), caused by Mycobacterium tuberculosis, remains a leading global health challenge, especially in low-resource settings. Accurate diagnosis from chest X-rays is critical yet challenging due to subtle manifestations of TB, particularly in its early stages. Traditional computational methods, primarily using basic convolutional neural networks (CNNs), often require extensive pre-processing and struggle with generalizability across diverse clinical environments. This study introduces a novel Vision Transformer (ViT) model augmented with Gradient-weighted Class Activation Mapping (Grad-CAM) to enhance both diagnostic accuracy and interpretability. The ViT model utilizes self-attention mechanisms to extract long-range dependencies and complex patterns directly from the raw pixel information, whereas Grad-CAM offers visual explanations of model decisions about highlighting significant regions in the X-rays. The model contains a Conv2D stem for initial feature extraction, followed by many transformer encoder blocks, thereby significantly boosting its ability to learn discriminative features without any pre-processing. Performance testing on a validation set had an accuracy of 0.97, recall of 0.99, and F1-score of 0.98 for TB patients. On the test set, the model has accuracy of 0.98, recall of 0.97, and F1-score of 0.98, which is better than existing methods. The addition of Grad-CAM visuals not only improves the transparency of the model but also assists radiologists in assessing and verifying AI-driven diagnoses. These results demonstrate the model's higher diagnostic precision and potential for clinical application in real-world settings, providing a massive improvement in the automated detection of TB.

使用视觉变压器和可解释的AI在胸部x光片上使用Grad-CAM方法增强结核病检测。
由结核分枝杆菌引起的结核病仍然是一个主要的全球卫生挑战,特别是在资源匮乏的环境中。胸部x光片的准确诊断至关重要,但由于结核病的细微表现,特别是在其早期阶段,因此具有挑战性。传统的计算方法,主要使用基本卷积神经网络(cnn),通常需要大量的预处理,并且难以在不同的临床环境中泛化。本研究引入了一种新的视觉变压器(ViT)模型,增强了梯度加权类激活映射(Grad-CAM),以提高诊断的准确性和可解释性。ViT模型利用自注意机制直接从原始像素信息中提取远程依赖关系和复杂模式,而Grad-CAM则提供了关于突出x射线中重要区域的模型决策的可视化解释。该模型包含一个用于初始特征提取的Conv2D stem,然后是许多变压器编码器块,从而大大提高了其无需任何预处理即可学习判别特征的能力。在验证集上进行性能测试,结核病患者的准确率为0.97,召回率为0.99,f1评分为0.98。在测试集上,该模型的准确率为0.98,召回率为0.97,f1得分为0.98,优于现有的方法。添加Grad-CAM视觉效果不仅提高了模型的透明度,而且还有助于放射科医生评估和验证人工智能驱动的诊断。这些结果表明该模型具有更高的诊断精度和在现实环境中临床应用的潜力,为结核病的自动检测提供了巨大的改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
BMC Medical Imaging
BMC Medical Imaging RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING-
CiteScore
4.60
自引率
3.70%
发文量
198
审稿时长
27 weeks
期刊介绍: BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信