Brain tumor image segmentation based on shuffle transformer-dynamic convolution and inception dilated convolution
Authors: Lifang Zhou, Ya Wang
Journal: Computer Vision and Image Understanding, volume 254, Article 104324
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.1016/j.cviu.2025.104324
URL: https://www.sciencedirect.com/science/article/pii/S1077314225000475
Impact factor: 4.3 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Accurate segmentation of brain tumors is essential for reliable clinical diagnosis and effective treatment. Convolutional neural networks (CNNs) have improved brain tumor segmentation through their strong performance in local feature modeling. However, they still struggle with unpredictable variation in tumor size and location, which CNN-based methods cannot match effectively because their receptive fields are local and regular. To overcome these obstacles, we propose a brain tumor image segmentation model based on shuffle transformer-dynamic convolution and inception dilated convolution that captures and adapts to different tumor features through multi-scale feature extraction. Our model uses Shuffle Transformer-Dynamic Convolution (STDC) to capture both fine-grained and contextual image details, which helps improve localization accuracy. In addition, the Inception Dilated Convolution (IDConv) module addresses the large variation in brain tumor size by capturing information about objects of different sizes. The multi-scale feature aggregation (MSFA) module integrates features from different encoder levels, which enriches the scale diversity of input patches and enhances the robustness of segmentation. Experimental results on the BraTS 2019, BraTS 2020, BraTS 2021, and MSD BTS datasets indicate that our model outperforms other state-of-the-art methods in terms of accuracy.
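The abstract does not describe the internals of the IDConv module, so the following is only a minimal sketch of the general idea it names: running the same small kernel at several dilation rates in parallel, so that each branch covers a different spatial extent. It is written in one dimension with plain Python for readability, and the function names (effective_kernel_size, dilated_conv1d, multi_rate_branches) and the dilation rates (1, 2, 3) are illustrative assumptions, not the authors' design.

```python
# Illustrative sketch only: a generic 1D dilated convolution and a set of
# parallel branches at different dilation rates. This is NOT the paper's
# IDConv module, whose exact structure the abstract does not specify.

def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D dilated cross-correlation (no padding, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart in the input.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def multi_rate_branches(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates, one output per rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    signal = list(range(10))
    kernel = [1, 1, 1]
    for rate, output in multi_rate_branches(signal, kernel).items():
        # A 3-tap kernel spans 3, 5, and 7 samples at rates 1, 2, and 3.
        print(rate, effective_kernel_size(len(kernel), rate), output)
```

A 3-tap kernel spans 3, 5, and 7 input samples at dilation rates 1, 2, and 3 without adding weights, which is why parallel dilated branches are a common way to respond to structures of different sizes.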
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems