Optimization of vision transformer-based detection of lung diseases from chest X-ray images
Jinsol Ko, Soyeon Park, Hyun Goo Woo
DOI: 10.1186/s12911-024-02591-3 (https://doi.org/10.1186/s12911-024-02591-3)
Published: 2024-07-08 (Journal Article)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232177/pdf/
Abstract
Background: Recent advances in Vision Transformer (ViT)-based deep learning have significantly improved the accuracy of lung disease prediction from chest X-ray images. However, few studies have compared the effectiveness of different optimizers for ViT models on this task. This study systematically evaluates and compares the performance of several optimization methods for ViT-based models in predicting lung diseases from chest X-ray images.
Methods: This study used a chest X-ray dataset of 19,003 images covering normal cases and six lung diseases: COVID-19, Viral Pneumonia, Bacterial Pneumonia, Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and Tuberculosis. Each ViT model (ViT, FastViT, and CrossViT) was trained separately with each optimization method (Adam, AdamW, NAdam, RAdam, SGDW, and Momentum) to assess its performance in lung disease prediction.
Results: When tested with ViT on the dataset with balanced class sizes, RAdam achieved the highest accuracy among the optimizers, at 95.87%. On the dataset with imbalanced class sizes, FastViT with NAdam performed best, with an accuracy of 97.63%.
Conclusions: We provide comprehensive optimization strategies for developing ViT-based model architectures, which can enhance the performance of these models for lung disease prediction from chest X-ray images.
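The comparison described in the Methods, where each of the three models is paired with each of the six optimizers, amounts to a grid of 18 training runs. The sketch below shows one way such a grid could be organized; it is not the authors' code. The function names `run_grid` and `best_configuration` and the `train_and_evaluate` callback are hypothetical, and the callback stands in for whatever training and test-set evaluation routine is actually used.

```python
from itertools import product
from typing import Callable, Dict, Tuple

# Model and optimizer names as listed in the abstract.
MODELS = ("ViT", "FastViT", "CrossViT")
OPTIMIZERS = ("Adam", "AdamW", "NAdam", "RAdam", "SGDW", "Momentum")

Config = Tuple[str, str]


def run_grid(train_and_evaluate: Callable[[str, str], float]) -> Dict[Config, float]:
    """Run one training job per (model, optimizer) pair.

    `train_and_evaluate` is a hypothetical callback that trains the named
    model with the named optimizer and returns its test accuracy.
    """
    return {
        (model, optimizer): train_and_evaluate(model, optimizer)
        for model, optimizer in product(MODELS, OPTIMIZERS)
    }


def best_configuration(results: Dict[Config, float]) -> Config:
    """Return the (model, optimizer) pair with the highest accuracy."""
    return max(results, key=results.get)
```

Under these assumptions, the grid would be run once per dataset variant (balanced and imbalanced), and `best_configuration` would be applied to each result table separately, since the abstract reports a different best pairing for each.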