Benchmarking and Boosting Transformers for Medical Image Classification.

Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...

Pub Date: 2022-09-01. Epub Date: 2022-09-15. DOI: 10.1007/978-3-031-16852-9_2

DongAo Ma, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Nahid Ui Islam, Fatemeh Haghighi, Michael B Gotway, Jianming Liang

Pages: 12-22. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646404/pdf/nihms-1846236.pdf


Abstract

Visual transformers have recently gained popularity in the computer vision community as they began to outrank convolutional neural networks (CNNs) in one representative visual benchmark after another. However, the competition between visual transformers and CNNs in medical imaging is rarely studied, leaving many important questions unanswered. As a first step, we benchmark how well existing transformer variants that use various (supervised and self-supervised) pre-training methods perform against CNNs on a variety of medical classification tasks. Furthermore, given the data-hungry nature of transformers and the annotation-deficiency challenge of medical imaging, we present a practical approach for bridging the domain gap between photographic and medical images by utilizing unlabeled large-scale in-domain data. Our extensive empirical evaluations reveal the following insights in medical imaging: (1) good initialization is more crucial for transformer-based models than for CNNs, (2) self-supervised learning based on masked image modeling captures more generalizable representations than supervised models, and (3) assembling a larger-scale domain-specific dataset can better bridge the domain gap between photographic and medical images via self-supervised continuous pre-training. We hope this benchmark study can direct future research on applying transformers to medical image analysis. All code and pre-trained models are available on our GitHub page: https://github.com/JLiangLab/BenchmarkTransformers.
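The masked image modeling mentioned in insight (2) can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names `random_patch_mask` and `masked_l1_loss`, the 60% mask ratio, and the choice of an L1 reconstruction loss are assumptions made here for illustration only. The idea shown is that a random subset of image patches is hidden, and the reconstruction error is computed only on the hidden patches, so no labels are needed.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Return a list of booleans; True marks a patch hidden from the model.

    Hypothetical helper: the mask ratio is an illustrative assumption.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_ids for i in range(num_patches)]


def masked_l1_loss(pred, target, mask):
    """Mean absolute error over the masked patches only.

    pred and target are lists of patches; each patch is a list of floats.
    Visible patches do not contribute to the loss.
    """
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:
            total += sum(abs(a - b) for a, b in zip(p, t))
            count += len(t)
    return total / count if count else 0.0


if __name__ == "__main__":
    # A 224x224 image split into 16x16 patches gives 14 * 14 = 196 patches.
    mask = random_patch_mask(196, 0.6, seed=0)
    print(sum(mask), "of", len(mask), "patches masked")

    # Toy example: two patches of two pixels each, only the first is masked.
    pred = [[0.0, 0.0], [1.0, 1.0]]
    target = [[1.0, 1.0], [1.0, 1.0]]
    print(masked_l1_loss(pred, target, [True, False]))
```

In a real pipeline the predictions would come from a transformer encoder with a lightweight decoder, and the same objective could be run first on photographic images and then continued on unlabeled in-domain medical images, which is the kind of continuous pre-training the abstract describes.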


