H-vmunet: High-order Vision Mamba UNet for medical image segmentation

IF 5.5 · CAS Tier 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang
Journal: Neurocomputing, Volume 624, Article 129447
DOI: 10.1016/j.neucom.2025.129447
Publication date: 2025-01-19 (Journal Article)
Citations: 0

Abstract

In the field of medical image segmentation, variant models based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have been extensively developed and applied. However, CNNs often struggle to process long-sequence information, while ViTs are less sensitive to local features and carry high computational complexity. Recently, the emergence of state-space models (SSMs), particularly the 2D-selective-scan (SS2D), has challenged the long-standing dominance of CNNs and ViTs as foundational modules in visual neural networks. In this paper, we extend SS2D by proposing a High-order Vision Mamba UNet (H-vmunet) for medical image segmentation. The H-vmunet model introduces two novel components: the High-order 2D-selective-scan (H-SS2D) and the Local-SS2D module. H-SS2D reduces the redundant information introduced when SS2D performs feature learning over the global receptive field, while the Local-SS2D module strengthens SS2D's ability to learn local features. We conducted comprehensive comparison and ablation experiments on three publicly available medical image datasets (ISIC2017, Spleen, and CVC-ClinicDB), and the results consistently demonstrate the strong performance of H-vmunet on medical image segmentation tasks. The code is available at https://github.com/wurenkai/H-vmunet.
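The abstract names the H-SS2D module but does not spell out its mechanics. As a loose, self-contained illustration of what a "high-order" spatial interaction can mean, the sketch below follows the general recipe of high-order gated designs: channels are split into groups, and each spatially mixed group gates the next, so the interaction order grows by one per step. Everything here is an assumption for illustration only: `local_mix` is a plain 3x3 mean filter standing in for the SS2D operator the paper actually uses, and the equal channel split and gating order are not taken from the paper.

```python
import numpy as np

def local_mix(x):
    """Stand-in spatial mixer: per-channel 3x3 mean filter.
    In H-SS2D this role would be played by 2D-selective-scan (SS2D)."""
    c, h, w = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += pad[:, dy:dy + h, dx:dx + w]
    return out / 9.0

def high_order_gate(x, order=3):
    """Toy high-order interaction: split channels into `order` equal
    groups; the running product is spatially mixed, then gated
    (elementwise-multiplied) by the next group, raising the
    interaction order by one at each step."""
    c, h, w = x.shape
    assert c % order == 0, "toy sketch assumes channels divisible by order"
    groups = np.split(x, order, axis=0)
    p = groups[0]
    for g in groups[1:]:
        p = local_mix(p) * g  # one more order of spatial interaction
    return p

x = np.arange(96, dtype=float).reshape(6, 4, 4) / 96.0
y = high_order_gate(x, order=3)  # shape (2, 4, 4): c // order channels
```

The point of the gating chain is that information which survives several multiplicative steps must be jointly supported by multiple channel groups, which is one intuition for how a high-order design can suppress redundant global responses.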
Source journal
Neurocomputing (Engineering & Technology / Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.