Switch-UMamba: Dynamic scanning vision Mamba UNet for medical image segmentation

IF 11.8, Zone 1 (Medicine), Q1 Computer Science, Artificial Intelligence

Medical Image Analysis. Pub Date: 2025-09-10. DOI: 10.1016/j.media.2025.103792

Ziyao Zhang, Qiankun Ma, Tong Zhang, Jie Chen, Hairong Zheng, Wen Gao

Medical Image Analysis, volume 107, Article 103792. Journal article; not open access. https://www.sciencedirect.com/science/article/pii/S136184152500338X

Citations: 0

Abstract

Recently, State Space Models (SSMs), particularly Mamba-based frameworks, have demonstrated exceptional performance in medical image segmentation. This is attributed to their capacity to capture long-range dependencies efficiently, with linear computational complexity. Nonetheless, current Mamba-based models struggle to preserve the spatial context of 2D visual features because they rely on static 1D selective scanning patterns. In this study, we present Switch-UMamba, a hybrid UNet framework that integrates the local feature extraction power of Convolutional Neural Networks (CNNs) with the ability of SSMs to capture long-range dependencies. Switch-UMamba uses the Switch Visual State Space (VSS) module to apply Mixture-of-Scans (MoS), a new scanning mechanism that combines diverse scanning policies by treating each scan head as an expert within the Mixture-of-Experts (MoE) framework. MoS employs a router to dynamically allocate appropriate scanning policies and the corresponding scan heads for each sample. This sparsely activated dynamic scanning approach captures rich and comprehensive spatial information while also reducing computational cost. Our experimental evaluation on several medical image segmentation benchmarks indicates that Switch-UMamba achieves state-of-the-art performance without using any pretrained weights. Our approach also outperforms other Mamba-based models while using fewer parameters.
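The per-sample routing idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the router, the set of scanning policies, or how scan outputs are merged. The four traversal orders, the mean-based gate, the top-k selection, and the toy linear recurrence standing in for an SSM scan head are all assumptions chosen only to show how a router could activate a sparse subset of scans for one sample.

```python
import math

def scan_orders(h, w):
    """Four fixed 1D traversal orders over an h x w grid (flat row-major indices).
    The choice of these four policies is an assumption for illustration."""
    row = [r * w + c for r in range(h) for c in range(w)]
    col = [r * w + c for c in range(w) for r in range(h)]
    return {"row": row, "row_rev": row[::-1], "col": col, "col_rev": col[::-1]}

def toy_scan_head(seq, decay):
    """Stand-in for an SSM scan head: h_t = decay * h_{t-1} + x_t (linear in len(seq))."""
    out, state = [], 0.0
    for x in seq:
        state = decay * state + x
        out.append(state)
    return out

def route_top_k(logits, k):
    """Keep the k highest-scoring experts and softmax-normalise their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

def mixture_of_scans(image, gate_weights, k=2, decay=0.5):
    """Run only the k routed scans for one sample and merge them on the 2D grid."""
    h, w = len(image), len(image[0])
    flat = [v for row in image for v in row]
    orders = scan_orders(h, w)
    names = list(orders)
    # Hypothetical router: one logit per scan policy, computed from the sample mean.
    mean = sum(flat) / len(flat)
    logits = [gw * mean + gb for gw, gb in gate_weights]
    gates = route_top_k(logits, k)
    merged = [0.0] * (h * w)
    for idx, g in gates.items():
        order = orders[names[idx]]
        scanned = toy_scan_head([flat[p] for p in order], decay)
        for pos, val in zip(order, scanned):
            merged[pos] += g * val  # scatter each output back to its pixel
    chosen = [names[i] for i in gates]
    return [merged[r * w:(r + 1) * w] for r in range(h)], chosen

# Usage: a 2 x 2 sample and made-up (weight, bias) pairs for the four policies.
image = [[1.0, 2.0], [3.0, 4.0]]
gate_weights = [(1.0, 0.0), (0.0, 0.0), (0.5, 0.0), (-1.0, 0.0)]
out, chosen = mixture_of_scans(image, gate_weights, k=2)
print(chosen)  # the two scan policies the router activated for this sample
```

Only the routed scans are executed, so the cost per sample grows with k rather than with the total number of scan policies, which is the sense in which such a scheme is sparsely activated.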

Source journal

Medical Image Analysis (Engineering & Technology: Biomedical Engineering)

CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months

About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.