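Improving Tuberculosis Detection in Chest X-Ray Images Through Transfer Learning and Deep Learning: Comparative Study of Convolutional Neural Network Architectures.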

JMIRx med | Pub Date: 2025-07-01 | DOI: 10.2196/66029

Alex Mirugwe, Lillian Tamale, Juwa Nyirenda


Citations: 0

Abstract


Improving Tuberculosis Detection in Chest X-Ray Images Through Transfer Learning and Deep Learning: Comparative Study of Convolutional Neural Network Architectures.

Background: Tuberculosis (TB) remains a significant global health challenge, as current diagnostic methods are often resource-intensive, time-consuming, and inaccessible in many high-burden communities, necessitating more efficient and accurate diagnostic methods to improve early detection and treatment outcomes.

Objective: This study aimed to evaluate the performance of 6 convolutional neural network architectures in classifying chest x-ray (CXR) images as either normal or TB-positive: Visual Geometry Group-16 (VGG16), VGG19, Residual Network-50 (ResNet50), ResNet101, ResNet152, and Inception-ResNet-V2. The impact of data augmentation on model performance, training times, and parameter counts was also assessed.

Methods: A dataset of 4200 CXR images, comprising 700 labeled as TB-positive and 3500 labeled as normal, was used to train and test the models. Evaluation metrics included accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve. The computational efficiency of each model was analyzed by comparing training times and parameter counts.
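To make the threshold-based metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score follow from binary confusion-matrix counts, with TB-positive treated as the positive class. The `ConfusionCounts` type and `classification_metrics` function are hypothetical names introduced here for illustration and are not taken from the study. The area under the receiver operating characteristic curve is omitted because it requires per-image scores rather than counts. The only figures taken from the abstract are the class sizes (700 TB-positive, 3500 normal).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; TB-positive is the positive class."""
    tp: int  # TB-positive images predicted TB-positive
    fp: int  # normal images predicted TB-positive
    fn: int  # TB-positive images predicted normal
    tn: int  # normal images predicted normal


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, precision, recall, and F1 as fractions in [0, 1]."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    # Precision and recall are defined as 0 when their denominator is empty.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Trivial baseline on the stated class sizes: label all 4200 images "normal".
always_normal = ConfusionCounts(tp=0, fp=0, fn=700, tn=3500)
baseline = classification_metrics(always_normal)
print(f"baseline accuracy = {baseline['accuracy']:.3f}, "
      f"recall = {baseline['recall']:.3f}")
```

Because the classes are imbalanced 1:5, this trivial baseline already reaches an accuracy of 3500/4200 (about 83%) while detecting no TB cases at all, which is why precision, recall, and F1-score are worth reading alongside accuracy.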

Results: VGG16 outperformed the other architectures, achieving an accuracy of 99.4%, precision of 97.9%, recall of 98.6%, F1-score of 98.3%, and area under the receiver operating characteristic curve of 98.25%. This superior performance is significant because it demonstrates that a simpler model can deliver exceptional diagnostic accuracy while requiring fewer computational resources. Surprisingly, data augmentation did not improve performance, suggesting that the original dataset's diversity was sufficient. Models with large numbers of parameters, such as ResNet152 and Inception-ResNet-V2, required longer training times without yielding proportionally better performance.
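As a quick consistency check on the reported VGG16 figures, the F1-score can be recomputed as the harmonic mean of the stated precision and recall. This is a minimal sketch: `f1_score` is a hypothetical helper written here, and the three constants are the rounded percentages from the abstract expressed as fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded VGG16 values as reported in the abstract.
reported_precision = 0.979
reported_recall = 0.986
reported_f1 = 0.983

recomputed_f1 = f1_score(reported_precision, reported_recall)
gap = abs(recomputed_f1 - reported_f1)
print(f"recomputed F1 = {recomputed_f1:.4f}, gap to reported = {gap:.4f}")
```

The recomputed value comes out at roughly 98.2%, less than 0.1 percentage point from the reported 98.3%. A gap of that size is plausibly explained by the precision and recall inputs themselves being rounded to one decimal place.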

Conclusions: Simpler models like VGG16 offer a favorable balance between diagnostic accuracy and computational efficiency for TB detection in CXR images. These findings highlight the need to tailor model selection to task-specific requirements, providing valuable insights for future research and clinical implementations in medical image classification.

