segamba - v2：用于一般3D医学图像分割的远程顺序建模曼巴。

IF 9.8 1区医学 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

IEEE Transactions on Medical Imaging Pub Date : 2025-07-18 DOI:10.1109/tmi.2025.3589797

Zhaohu Xing,Tian Ye,Yijun Yang,Du Cai,Baowen Gai,Xiao-Jian Wu,Feng Gao,Lei Zhu

{"title":"segamba - v2：用于一般3D医学图像分割的远程顺序建模曼巴。","authors":"Zhaohu Xing,Tian Ye,Yijun Yang,Du Cai,Baowen Gai,Xiao-Jian Wu,Feng Gao,Lei Zhu","doi":"10.1109/tmi.2025.3589797","DOIUrl":null,"url":null,"abstract":"The Transformer architecture has demonstrated remarkable results in 3D medical image segmentation due to its capability of modeling global relationships. However, it poses a significant computational burden when processing high-dimensional medical images. Mamba, as a State Space Model (SSM), has recently emerged as a notable approach for modeling long-range dependencies in sequential data. Although a substantial amount of Mamba-based research has focused on natural language and 2D image processing, few studies explore the capability of Mamba on 3D medical images. In this paper, we propose SegMamba-V2, a novel 3D medical image segmentation model, to effectively capture long-range dependencies within whole-volume features at each scale. To achieve this goal, we first devise a hierarchical scale downsampling strategy to enhance the receptive field and mitigate information loss during downsampling. Furthermore, we design a novel tri-orientated spatial Mamba block that extends the global dependency modeling process from one plane to three orthogonal planes to improve feature representation capability. Moreover, we collect and annotate a large-scale dataset (named CRC-2000) with fine-grained categories to facilitate benchmarking evaluation in 3D colorectal cancer (CRC) segmentation. We evaluate the effectiveness of our SegMamba-V2 on CRC-2000 and three other large-scale 3D medical image segmentation datasets, covering various modalities, organs, and segmentation targets. Experimental results demonstrate that our Segmamba-V2 outperforms state-of-the-art methods by a significant margin, which indicates the universality and effectiveness of the proposed model on 3D medical image segmentation tasks. The code for SegMamba-V2 is publicly available at: https://github.com/ge-xing/SegMamba-V2.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"18 1","pages":""},"PeriodicalIF":9.8000,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SegMamba-V2: Long-range Sequential Modeling Mamba For General 3D Medical Image Segmentation.\",\"authors\":\"Zhaohu Xing,Tian Ye,Yijun Yang,Du Cai,Baowen Gai,Xiao-Jian Wu,Feng Gao,Lei Zhu\",\"doi\":\"10.1109/tmi.2025.3589797\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Transformer architecture has demonstrated remarkable results in 3D medical image segmentation due to its capability of modeling global relationships. However, it poses a significant computational burden when processing high-dimensional medical images. Mamba, as a State Space Model (SSM), has recently emerged as a notable approach for modeling long-range dependencies in sequential data. Although a substantial amount of Mamba-based research has focused on natural language and 2D image processing, few studies explore the capability of Mamba on 3D medical images. In this paper, we propose SegMamba-V2, a novel 3D medical image segmentation model, to effectively capture long-range dependencies within whole-volume features at each scale. To achieve this goal, we first devise a hierarchical scale downsampling strategy to enhance the receptive field and mitigate information loss during downsampling. Furthermore, we design a novel tri-orientated spatial Mamba block that extends the global dependency modeling process from one plane to three orthogonal planes to improve feature representation capability. Moreover, we collect and annotate a large-scale dataset (named CRC-2000) with fine-grained categories to facilitate benchmarking evaluation in 3D colorectal cancer (CRC) segmentation. We evaluate the effectiveness of our SegMamba-V2 on CRC-2000 and three other large-scale 3D medical image segmentation datasets, covering various modalities, organs, and segmentation targets. Experimental results demonstrate that our Segmamba-V2 outperforms state-of-the-art methods by a significant margin, which indicates the universality and effectiveness of the proposed model on 3D medical image segmentation tasks. The code for SegMamba-V2 is publicly available at: https://github.com/ge-xing/SegMamba-V2.\",\"PeriodicalId\":13418,\"journal\":{\"name\":\"IEEE Transactions on Medical Imaging\",\"volume\":\"18 1\",\"pages\":\"\"},\"PeriodicalIF\":9.8000,\"publicationDate\":\"2025-07-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Medical Imaging\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1109/tmi.2025.3589797\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Medical Imaging","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/tmi.2025.3589797","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

摘要

Transformer架构由于其建模全局关系的能力，在三维医学图像分割中表现出显著的效果。然而，在处理高维医学图像时，它带来了巨大的计算负担。作为一种状态空间模型（SSM）， Mamba最近作为一种值得注意的方法出现，用于对顺序数据中的远程依赖关系进行建模。尽管大量基于曼巴的研究集中在自然语言和2D图像处理上，但很少有研究探索曼巴在3D医学图像上的能力。在本文中，我们提出了一种新的三维医学图像分割模型segamba - v2，以有效地捕获每个尺度下全体积特征中的远程依赖关系。为了实现这一目标，我们首先设计了一种分层尺度降采样策略来增强感受野并减轻降采样过程中的信息丢失。此外，我们设计了一种新颖的三向空间曼巴块，将全局依赖建模过程从一个平面扩展到三个正交平面，以提高特征表示能力。此外，我们收集并标注了一个具有细粒度分类的大规模数据集（CRC -2000），以便于在三维结直肠癌（CRC）分割中进行基准评估。我们评估了我们的分段- v2在CRC-2000和其他三个大规模三维医学图像分割数据集上的有效性，这些数据集涵盖了各种模式、器官和分割目标。实验结果表明，我们的segamba - v2比目前最先进的方法要好得多，这表明所提出的模型在3D医学图像分割任务上的通用性和有效性。segamba - v2的代码可以在https://github.com/ge-xing/SegMamba-V2上公开获得。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

SegMamba-V2: Long-range Sequential Modeling Mamba For General 3D Medical Image Segmentation.

The Transformer architecture has demonstrated remarkable results in 3D medical image segmentation due to its capability of modeling global relationships. However, it poses a significant computational burden when processing high-dimensional medical images. Mamba, as a State Space Model (SSM), has recently emerged as a notable approach for modeling long-range dependencies in sequential data. Although a substantial amount of Mamba-based research has focused on natural language and 2D image processing, few studies explore the capability of Mamba on 3D medical images. In this paper, we propose SegMamba-V2, a novel 3D medical image segmentation model, to effectively capture long-range dependencies within whole-volume features at each scale. To achieve this goal, we first devise a hierarchical scale downsampling strategy to enhance the receptive field and mitigate information loss during downsampling. Furthermore, we design a novel tri-orientated spatial Mamba block that extends the global dependency modeling process from one plane to three orthogonal planes to improve feature representation capability. Moreover, we collect and annotate a large-scale dataset (named CRC-2000) with fine-grained categories to facilitate benchmarking evaluation in 3D colorectal cancer (CRC) segmentation. We evaluate the effectiveness of our SegMamba-V2 on CRC-2000 and three other large-scale 3D medical image segmentation datasets, covering various modalities, organs, and segmentation targets. Experimental results demonstrate that our Segmamba-V2 outperforms state-of-the-art methods by a significant margin, which indicates the universality and effectiveness of the proposed model on 3D medical image segmentation tasks. The code for SegMamba-V2 is publicly available at: https://github.com/ge-xing/SegMamba-V2.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Medical Imaging 医学-成像科学与照相技术

CiteScore

21.80

自引率

5.70%

发文量

637

审稿时长

5.6 months

期刊介绍： The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy. T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods. While the journal welcomes strong application papers that describe novel methods, it directs papers that focus solely on important applications using medically adopted or well-established methods without significant innovation in methodology to other journals. T-MI is indexed in Pubmed® and Medline®, which are products of the United States National Library of Medicine.