Preoperative evaluation frequently relies on multi-modal medical imaging to provide comprehensive anatomical and functional insights. However, acquiring such multi-modal data often involves high scanning costs and logistical challenges. Moreover, in practice it is often impractical to collect enough matched multi-modal data to train a separate model for each cross-modality synthesis task.
To address these issues, we propose a novel dual-branch architecture, named Mamba-Convolutional UNet, for multi-modal medical image synthesis. Furthermore, to retain cross-modal synthesis capability under data scarcity, we introduce a simple reprogramming layer that addresses the practical challenge of limited paired multi-modal training data.
The proposed Mamba-Convolutional UNet adopts a U-shaped architecture featuring parallel SSM and convolutional branches. The SSM branch leverages Mamba to capture long-range dependencies and global context, while the convolutional branch extracts fine-grained local features through spatial operations. An attention mechanism then integrates the global and local features. To enhance adaptability across modalities with limited data, a lightweight reprogramming layer is incorporated into the Mamba module, allowing knowledge transfer from one cross-modal synthesis task to another without extensive retraining.
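To make the block structure concrete, the following is a minimal PyTorch sketch of one dual-branch block with attention-based fusion and a reprogramming layer. The abstract does not specify implementation details, so every name and design choice here (the GRU stand-in for the Mamba SSM branch, the prototype-based `ReprogrammingLayer`, the channel-attention gate) is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of one dual-branch block; not the authors' code.
import torch
import torch.nn as nn


class ReprogrammingLayer(nn.Module):
    """Lightweight adapter (assumed design): cross-attends token features
    onto a small set of learnable target-modality prototypes."""

    def __init__(self, dim: int, n_prototypes: int = 64):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, L, C) token sequence; queries come from the input,
        # keys/values from the learned prototypes.
        proto = self.prototypes.unsqueeze(0).expand(x.size(0), -1, -1)
        out, _ = self.attn(x, proto, proto)
        return x + out  # residual so the layer starts near identity


class MambaConvBlock(nn.Module):
    """Parallel SSM and convolutional branches fused by channel attention.
    The SSM branch is stood in for by a GRU here to stay self-contained;
    a real implementation would use a Mamba layer (e.g. mamba_ssm.Mamba)."""

    def __init__(self, dim: int):
        super().__init__()
        self.ssm = nn.GRU(dim, dim, batch_first=True)  # stand-in for Mamba
        self.reprogram = ReprogrammingLayer(dim)
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
        # Squeeze-and-excitation-style gate weighting global vs. local features.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(2 * dim, dim, 1), nn.Sigmoid()
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)     # (B, H*W, C)
        g, _ = self.ssm(tokens)                   # global context branch
        g = self.reprogram(g)                     # target-modality adapter
        g = g.transpose(1, 2).reshape(b, c, h, w)
        l = self.conv(x)                          # local feature branch
        a = self.gate(torch.cat([g, l], dim=1))   # attention fusion weights
        return a * g + (1 - a) * l


if __name__ == "__main__":
    block = MambaConvBlock(dim=32)
    y = block(torch.randn(2, 32, 64, 64))
    print(y.shape)  # torch.Size([2, 32, 64, 64])
```

Under this reading, fine-tuning for a new synthesis task could update only the reprogramming layer's prototypes while the rest of the block stays frozen, which would account for the reduced data requirement; the paper itself does not state this procedure.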
We conducted five multi-modal medical image synthesis tasks on three datasets to validate the performance of our model. The results demonstrate that Mamba-Convolutional UNet significantly outperforms six baseline models. Moreover, when fine-tuned for other synthesis tasks with only 25% of the training data, Mamba-Convolutional UNet attains performance comparable to current state-of-the-art methods.
The proposed Mamba-Convolutional UNet features a dual-branch structure that effectively combines global and local features for enhanced medical image understanding, and the reprogramming layer in its Mamba block addresses the challenge of target-modality transformation when training data are insufficient.