Mamba-Convolutional UNet for multi-modal medical image synthesis
WenLong Lin, Yu Luo, Jie Ling, FengHuan Li, Jing Qin, ZhiChao Yin, Shun Yao
Background
Preoperative evaluation frequently relies on multi-modal medical imaging to provide comprehensive anatomical and functional insights. However, acquiring such multi-modal data often involves high scanning costs and logistical challenges. Moreover, in practice it is often impractical to collect enough paired multi-modal data to train a separate model for each cross-modality synthesis task.
Purpose
To address these issues, we propose a novel dual-branch architecture, named Mamba-Convolutional UNet, for multi-modal medical image synthesis. Furthermore, to enable cross-modal synthesis even when paired multi-modal training data are scarce, we introduce a simple reprogramming layer.
Methods
The proposed Mamba-Convolutional UNet adopts a U-shaped architecture with parallel state space model (SSM) and convolutional branches. The SSM branch leverages Mamba to capture long-range dependencies and global context, while the convolutional branch extracts fine-grained local features through spatial operations. An attention mechanism then integrates the global and local features. To enhance adaptability across modalities with limited data, a lightweight reprogramming layer is incorporated into the Mamba module, allowing knowledge transfer from one cross-modal synthesis task to another without extensive retraining.
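As a rough illustration of the ideas described above, the PyTorch sketch below pairs a simplified diagonal state-space scan (a stand-in for a full Mamba block) with a convolutional branch and fuses the two with channel attention, followed by a small residual adapter in the spirit of the reprogramming layer. The class names (`SimpleSSMBranch`, `DualBranchBlock`, `ReprogrammingLayer`) and all implementation details are our own assumptions for illustration, not the authors' code.

```python
# Minimal sketch of the dual-branch idea, assuming a simplified diagonal SSM
# in place of the full Mamba block. Not the authors' implementation.
import torch
import torch.nn as nn

class SimpleSSMBranch(nn.Module):
    """Diagonal state-space scan over flattened spatial tokens (Mamba stand-in)."""
    def __init__(self, channels: int):
        super().__init__()
        self.a_logit = nn.Parameter(torch.zeros(channels))  # per-channel state decay
        self.b = nn.Parameter(torch.ones(channels))         # input projection
        self.c = nn.Parameter(torch.ones(channels))         # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)                  # (B, H*W, C) token sequence
        a = torch.sigmoid(self.a_logit)                     # keep recurrence stable in (0, 1)
        state = torch.zeros(b, c, device=x.device)
        outs = []
        for t in range(seq.size(1)):                        # sequential scan = global context
            state = a * state + self.b * seq[:, t]
            outs.append(self.c * state)
        y = torch.stack(outs, dim=1).transpose(1, 2)        # (B, C, H*W)
        return y.reshape(b, c, h, w)

class DualBranchBlock(nn.Module):
    """Parallel SSM + convolutional branches fused by channel attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.ssm = SimpleSSMBranch(channels)                # global, long-range features
        self.conv = nn.Sequential(                          # local, fine-grained features
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.gate = nn.Sequential(                          # channel attention over both branches
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g, l = self.ssm(x), self.conv(x)
        w = self.gate(torch.cat([g, l], dim=1))             # per-channel mixing weight
        return w * g + (1 - w) * l + x                      # fused features plus residual

class ReprogrammingLayer(nn.Module):
    """Hypothetical lightweight adapter: remaps a new modality's features onto
    the representation learned for the source synthesis task."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, 1)        # 1x1 remapping, few parameters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.proj(x)                             # residual preserves pretrained behavior

torch.manual_seed(0)
block = DualBranchBlock(8)
adapter = ReprogrammingLayer(8)                             # adapter for a new target modality
x = torch.randn(1, 8, 16, 16)                               # small input: the scan is O(H*W)
print(block(adapter(x)).shape)                              # torch.Size([1, 8, 16, 16])
```

In the limited-data fine-tuning scenario described here, one would plausibly freeze the pretrained dual-branch weights and train only the small adapter; the abstract does not state the authors' exact fine-tuning protocol.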
Results
We conducted five multi-modal medical image synthesis tasks on three datasets to validate the model. The results demonstrate that Mamba-Convolutional UNet significantly outperforms six baseline models. Moreover, when fine-tuned for other synthesis tasks with only 25% of the training data, Mamba-Convolutional UNet attains performance comparable to current state-of-the-art methods.
Conclusions
The proposed Mamba-Convolutional UNet features a dual-branch structure that effectively combines global and local features for enhanced medical image understanding. Moreover, the reprogramming layer in the Mamba block mitigates the difficulty of synthesizing target modalities when training data are insufficient.