Switch-UMamba: Dynamic scanning vision Mamba UNet for medical image segmentation
Ziyao Zhang, Qiankun Ma, Tong Zhang, Jie Chen, Hairong Zheng, Wen Gao
Medical Image Analysis, vol. 107, Article 103792. Published 2025-09-10. DOI: 10.1016/j.media.2025.103792
Impact factor: 11.8. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S136184152500338X
Citation count: 0
Abstract
Recently, State Space Models (SSMs), particularly Mamba-based frameworks, have demonstrated exceptional performance in medical image segmentation. This is attributed to their capacity to capture long-range dependencies efficiently, with linear computational complexity. Nonetheless, current Mamba-based models struggle to preserve the spatial context of 2D visual features because they rely on static 1D selective scanning patterns. In this study, we present Switch-UMamba, a hybrid UNet framework that integrates the local feature extraction power of Convolutional Neural Networks (CNNs) with the ability of SSMs to capture long-range dependencies. Switch-UMamba builds on the Switch Visual State Space (VSS) module, which implements the Mixture-of-Scans (MoS) approach: a new scanning mechanism that combines diverse scanning policies by treating each scan head as an expert within the Mixture-of-Experts (MoE) framework. MoS employs a router to dynamically allocate appropriate scanning policies and the corresponding scan heads for each sample. This sparsely activated dynamic scanning approach not only ensures a rich and comprehensive acquisition of spatial information but also reduces computational cost. Our experimental evaluation on several medical image segmentation benchmarks indicates that Switch-UMamba achieves state-of-the-art performance without using any pretrained weights. It is also worth highlighting that our approach outperforms other Mamba-based models while using fewer parameters.
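The routing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the four scan policies, the linear gate on the sample mean, the top-k renormalisation, and the simple decaying recurrence standing in for a scan head are all assumptions chosen to keep the example small, and every function name below is hypothetical.

```python
import math


def scan_orders(h, w):
    """Four assumed scan policies, each a list of flat indices into an h x w grid."""
    row_major = [r * w + c for r in range(h) for c in range(w)]
    col_major = [r * w + c for c in range(w) for r in range(h)]
    return {
        "row": row_major,
        "row_rev": row_major[::-1],
        "col": col_major,
        "col_rev": col_major[::-1],
    }


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(sample, gate_weights, k=2):
    """Pick the top-k scan policies for one sample.

    gate_weights maps policy name -> (weight, bias); the gate is a toy
    linear function of the sample mean (an assumption, not from the paper).
    Returns [(name, weight)] with weights renormalised to sum to 1.
    """
    mean = sum(sample) / len(sample)
    names = list(gate_weights)
    logits = [gate_weights[n][0] * mean + gate_weights[n][1] for n in names]
    probs = softmax(logits)
    # sorted() is stable, so ties keep the order of gate_weights.
    top = sorted(zip(names, probs), key=lambda p: p[1], reverse=True)[:k]
    norm = sum(p for _, p in top)
    return [(n, p / norm) for n, p in top]


def causal_scan(seq, decay=0.5):
    """Stand-in for a scan head: h_t = decay * h_{t-1} + x_t, linear in len(seq)."""
    out, state = [], 0.0
    for x in seq:
        state = decay * state + x
        out.append(state)
    return out


def mixture_of_scans(sample, h, w, gate_weights, k=2):
    """Run only the k routed scan heads and mix their outputs per pixel."""
    orders = scan_orders(h, w)
    mixed = [0.0] * (h * w)
    for name, weight in route(sample, gate_weights, k):
        order = orders[name]
        scanned = causal_scan([sample[i] for i in order])
        # Scatter each scanned value back to its original pixel position.
        for pos, idx in enumerate(order):
            mixed[idx] += weight * scanned[pos]
    return mixed


# Usage: a 2 x 2 "image" whose gate favours the row and column scans.
gates = {"row": (0.0, 2.0), "row_rev": (0.0, 0.0), "col": (0.0, 2.0), "col_rev": (0.0, 0.0)}
result = mixture_of_scans([1.0, 2.0, 3.0, 4.0], 2, 2, gates, k=2)
```

Only k of the four scan heads run for a given sample, which is where the sparse activation saves computation in this sketch; a real model would use learned gates and selective SSM blocks in place of the toy recurrence.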
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.