Georg Hille, Pavan Tummala, Lena Spitz, Sylvia Saalfeld
Transformers for colorectal cancer segmentation in CT imaging.
International Journal of Computer Assisted Radiology and Surgery. Published 2024-10-01 (Epub 2024-07-04). DOI: 10.1007/s11548-024-03217-9 (https://doi.org/10.1007/s11548-024-03217-9)
Abstract
Purpose: Transformer models have recently become the state of the art in various medical image segmentation tasks and challenges, outperforming most conventional deep learning approaches. Picking up on that trend, this study applies various transformer models to the highly challenging task of colorectal cancer (CRC) segmentation in CT imaging and assesses how they hold up against the current state-of-the-art convolutional neural network (CNN), the nnUnet. Furthermore, we investigated the impact of network size on the resulting accuracies, since transformer models tend to be significantly larger than conventional network architectures.
Methods: For this purpose, six different transformer models with specific architectural advancements and network sizes were implemented alongside the aforementioned nnUnet and applied to the CRC segmentation task of the medical segmentation decathlon.
Results: The best results were achieved with Swin-UNETR, D-Former, and VT-Unet, all of which are transformer models, with Dice similarity coefficients (DSC) of 0.60, 0.59, and 0.59, respectively. Transformer architectures therefore outperformed the current state-of-the-art CNN, the nnUnet, on this task. Furthermore, a comparison with the inter-observer variability (IOV) of approx. 0.64 DSC indicates almost expert-level accuracy. The comparatively low IOV emphasizes the complexity and challenge of CRC segmentation and indicates limits on the achievable segmentation accuracy.
Conclusion: This study shows that transformer models continue their current upward trend in producing state-of-the-art results, including for the challenging task of CRC segmentation. However, with ever smaller advances in total accuracy, as demonstrated in this study by the on-par performance of multiple network variants, other advantages such as efficiency, low computational demands, or ease of adaptation to new tasks become more and more relevant.
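The Dice similarity coefficient (DSC) used to report both the model results and the inter-observer variability is, in its standard form, twice the overlap of two segmentations divided by the sum of their sizes. The following is a minimal sketch of that general definition, not the evaluation code of the study, which the abstract does not show. The function name, the representation of a mask as a set of voxel coordinates, the toy masks, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
from itertools import product


def dice_similarity(voxels_a, voxels_b):
    """Dice similarity coefficient between two sets of foreground voxel coordinates.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical).
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical toy masks given as (z, y, x) voxel coordinates.
predicted = set(product(range(1, 3), range(1, 3), range(1, 3)))  # 8 voxels
reference = set(product(range(1, 3), range(1, 3), range(1, 4)))  # 12 voxels

score = dice_similarity(predicted, reference)
print(round(score, 2))  # 2 * 8 / (8 + 12) = 0.8
```

The same function can compare two human annotations of one scan, which is how an inter-observer figure such as the reported approx. 0.64 DSC can be read on the same scale as the model scores.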
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.