A Study of Vision Transformer for Lung Diseases Classification

M. Nguyen, Khai Ngo Quang
Published in: 2022 6th International Conference on Green Technology and Sustainable Development (GTSD)
Publication date: 2022-07-29
DOI: 10.1109/GTSD54989.2022.9989100 (https://doi.org/10.1109/GTSD54989.2022.9989100)
Citations: 2

Abstract

Transformer models have achieved considerable success in natural language processing. In computer vision, transformer-based backbones have recently become competitive with CNN-based backbones on many tasks. The success of transformer-based backbones relies on a model pre-trained on very large datasets. This requirement may not be met in medical imaging applications: compared to the ImageNet-21K dataset, medical image datasets are very limited. In this paper, we therefore investigate the performance of the Vision Transformer (ViT) on medical image classification. The Vision Transformer is first fine-tuned on well-known medical datasets and then fine-tuned again on the VinDr-CXR dataset. Comprehensive experiments show that the proposed method is slightly better than conventional convolution-based methods in terms of classification accuracy. In terms of model interpretability, however, ViT-based models can handle the co-occurrence of multiple diseases in a single medical image.
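
The abstract does not say how the classifier's outputs are structured. As an illustration only, one common way to let a chest X-ray classifier report several co-occurring findings is to score each disease with its own sigmoid instead of a single softmax over all classes. The sketch below assumes that setup; the label names, the 0.5 threshold, and the helper functions are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical label set, for illustration only; not taken from the paper.
LABELS = ["cardiomegaly", "pleural_effusion", "nodule"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multilabel(logits, threshold=0.5):
    """Return every label whose independent sigmoid probability exceeds the threshold.

    Each label is scored on its own, so several labels can be positive at once.
    """
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) > threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over labels (one independent term per label)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


if __name__ == "__main__":
    # Made-up logits standing in for the output of a fine-tuned backbone.
    logits = [2.0, 1.5, -3.0]
    print(predict_multilabel(logits))
    print(bce_loss(logits, [1, 1, 0]))
```

With a softmax head the class probabilities would compete and sum to one, so reporting two findings for the same image would require an extra decision rule; independent sigmoids avoid that at the cost of ignoring correlations between labels.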