Recent advances of Transformers in medical image analysis: A comprehensive review
Kun Xia, Jinzhuo Wang
MedComm - Future medicine, volume 2, issue 1. Published 2023-03-24.
DOI: 10.1002/mef2.38
https://onlinelibrary.wiley.com/doi/10.1002/mef2.38
Citations: 4
Abstract
Recent work has shown that the Transformer's excellent performance on natural language processing tasks carries over to natural image analysis tasks. However, the complicated clinical settings of medical image analysis and the varied properties of diseases pose new challenges for applying Transformers. The computer vision and medical engineering communities have devoted significant effort to Transformer-based medical image analysis research, with a special focus on scenario-specific architectural variations. In this paper, we comprehensively review this rapidly developing area, covering the latest advances in Transformer-based methods for medical image analysis across different settings. We first introduce the basic mechanisms of the Transformer, including implementations of self-attention and typical architectures. We then systematically review the important research problems across medical image data modalities, clinical visual tasks, organs, and diseases. We collect 276 very recent works and 76 public medical image analysis datasets in an organized structure. Finally, we discuss open problems and future research directions. We expect this review to be an up-to-date roadmap and a reference source for advancing the field of medical image analysis.