Enhanced CoAtNet-based hybrid deep learning architecture for automated tuberculosis detection in human chest X-rays

IF 3.2 | Zone 3, Medicine | JCR Q2, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

BMC Medical Imaging | Pub Date: 2025-09-26 | DOI: 10.1186/s12880-025-01901-z

Gunjan Siddharth, Ananya Ambekar, Naveenkumar Jayakumar

BMC Medical Imaging, vol. 25, no. 1, 379. Journal Article. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465491/pdf/

Citations: 0

Abstract

Tuberculosis (TB) is a serious infectious disease that remains a global health challenge. While chest X-rays (CXRs) are widely used for TB detection, manual interpretation can be subjective and time-consuming. Automated classification of CXRs into TB and non-TB cases can significantly support healthcare professionals in timely and accurate diagnosis. This paper introduces a hybrid deep learning approach for classifying CXR images. The solution is based on the CoAtNet framework, which combines the strengths of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). The model is pre-trained on the large-scale ImageNet dataset to ensure robust generalization across diverse images. The evaluation is conducted on the IN-CXR tuberculosis dataset from ICMR-NIRT, which contains a comprehensive collection of CXR images of both normal and abnormal categories. The hybrid model achieves a binary classification accuracy of 86.39% and an ROC-AUC score of 93.79%, outperforming tested baseline models that rely exclusively on either CNNs or ViTs when trained on this dataset. Furthermore, the integration of Local Interpretable Model-agnostic Explanations (LIME) enhances the interpretability of the model's predictions. This combination of reliable performance and transparent, interpretable results strengthens the model's role in AI-driven medical imaging research. Code will be made available upon request.
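The abstract reports two figures for the binary TB versus non-TB task: classification accuracy and ROC-AUC. As a minimal sketch of how these two metrics are commonly computed from per-image scores, the Python below uses only the standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; none of them come from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up toy data (1 = TB, 0 = non-TB); NOT taken from the paper.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy={toy_accuracy:.4f}  roc_auc={toy_auc:.4f}")
```

Accuracy depends on the chosen decision threshold, whereas ROC-AUC depends only on how the scores rank positive cases against negative ones. This is one reason the two figures can differ for the same model.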


Source journal

BMC Medical Imaging (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore: 4.60

Self-citation rate: 3.70%

Articles published: 198

Review time: 27 weeks

Journal description: BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.