Exploring a Novel Conv-Transformer Network for Multi-Modality Heart Segmentation

iRadiology Pub Date : 2026-03-03 Epub Date: 2025-10-16 DOI:10.1002/ird3.70028
Youyou Ding, Hao Dang, Jiayi Luo, Xiaoyu Zhuo, Ningyu Huang, Junsheng Xiao, Zongwang Lv
Journal: iRadiology, Volume 4, Issue 1, pp. 13-22. Full text: https://onlinelibrary.wiley.com/doi/10.1002/ird3.70028
Citation count: 0

Abstract

Background

In recent years, deep convolutional neural networks (CNNs) have achieved great success in medical imaging. However, single-modality medical images rarely provide the accurate pathological information needed for clinical diagnosis and treatment. This study aims to provide an efficient multimodality whole heart segmentation method for the diagnosis of coronary heart disease.

Methods

We propose SFAM-TransUnet, a novel deep learning framework for multimodality whole heart segmentation that combines CNNs and transformers. First, the method integrates CNNs and vision transformers (ViTs) into a unified fusion framework. Specifically, a shallow feature fusion module is designed to connect MRI and CT images, thereby providing a powerful and efficient multimodality fusion backbone for semantic segmentation. Furthermore, we propose a fusion ViT (FViT) module combining self-attention (SA) with adaptive mutual boost attention (Ada-MBA) to enhance contextual information within and across modalities. The Ada-MBA module assigns attention to semantically salient regions by computing both self-attention and cross-attention, which improves the model's ability to interpret context from the different modalities. Extensive experiments are conducted on the clinical Multi-Modality Whole Heart Segmentation datasets.
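The paper's exact SA and Ada-MBA formulations are not reproduced here, but the mechanism they build on is standard scaled dot-product attention: self-attention when queries, keys, and values come from one modality, and cross-attention when queries come from one modality (e.g., MRI features) and keys/values from the other (e.g., CT features). As a minimal illustration only, a plain-Python sketch (the function names are ours, not the authors'):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors.

    Self-attention: queries, keys, and values all come from the same
    modality. Cross-attention (as used inside modules like Ada-MBA):
    queries from one modality, keys/values from the other, so each
    query vector aggregates the most relevant features of the other
    modality.
    """
    d = len(keys[0])  # feature dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

For example, a query aligned with the first key attends almost entirely to the first value vector; in the cross-modal case this means an MRI feature pulls in the CT feature it most resembles.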

Results

We improved the whole heart segmentation Dice similarity coefficients (DSCs) to 0.902 (AA), 0.920 (LV-blood), 0.863 (LA-blood), and 0.837 (LV-myo); the Hausdorff distances (HDs) to 9.886 (AA), 9.947 (LV-blood), 11.911 (LA-blood), and 13.599 (LV-myo); the PSNR values to 33.577 (AA), 30.091 (LV-blood), 32.055 (LA-blood), and 29.837 (LV-myo); and the SSIM values to 0.901 (AA), 0.818 (LV-blood), 0.765 (LA-blood), and 0.743 (LV-myo). These results demonstrate that SFAM-TransUnet outperforms various alternative methods.
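The headline metric above, the Dice similarity coefficient, measures the overlap between a predicted segmentation mask and the ground-truth mask: DSC = 2|A∩B| / (|A|+|B|), ranging from 0 (no overlap) to 1 (perfect overlap). A minimal sketch for binary masks (our illustration, not the paper's evaluation code):

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks,
    given as flat lists of 0/1 labels of equal length.

    DSC = 2 * |intersection| / (|pred| + |target|).
    Returns 1.0 when both masks are empty (perfect agreement).
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

For instance, a prediction that recovers one of two foreground pixels scores 2*1/(2+1) = 2/3; per-structure scores such as the 0.920 reported for LV-blood are computed the same way over the voxels of that structure.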

Conclusions

We propose SFAM-TransUnet, an efficient framework tailored for whole heart segmentation that combines CNNs and transformers. It provides a powerful multimodality fusion network that improves the performance of whole heart semantic segmentation. These results demonstrate the efficacy of SFAM-TransUnet in integrating relevant information across different modalities in multimodal tasks.

[Graphical abstract]
