CardSegNet：用于心脏核磁共振成像中心脏区域分割的自适应混合 CNN 视觉变换器模型

IF 5.4 2区医学 Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics Pub Date : 2024-04-16 DOI:10.1016/j.compmedimag.2024.102382

Hamed Aghapanah , Reza Rasti , Saeed Kermani , Faezeh Tabesh , Hossein Yousefi Banaem , Hamidreza Pour Aliakbar , Hamid Sanei , William Paul Segars

{"title":"CardSegNet：用于心脏核磁共振成像中心脏区域分割的自适应混合 CNN 视觉变换器模型","authors":"Hamed Aghapanah , Reza Rasti , Saeed Kermani , Faezeh Tabesh , Hossein Yousefi Banaem , Hamidreza Pour Aliakbar , Hamid Sanei , William Paul Segars","doi":"10.1016/j.compmedimag.2024.102382","DOIUrl":null,"url":null,"abstract":"<div><p>Cardiovascular MRI (CMRI) is a non-invasive imaging technique adopted for assessing the blood circulatory system’s structure and function. Precise image segmentation is required to measure cardiac parameters and diagnose abnormalities through CMRI data. Because of anatomical heterogeneity and image variations, cardiac image segmentation is a challenging task. Quantification of cardiac parameters requires high-performance segmentation of the left ventricle (LV), right ventricle (RV), and left ventricle myocardium from the background. The first proposed solution here is to manually segment the regions, which is a time-consuming and error-prone procedure. In this context, many semi- or fully automatic solutions have been proposed recently, among which deep learning-based methods have revealed high performance in segmenting regions in CMRI data. In this study, a self-adaptive multi attention (SMA) module is introduced to adaptively leverage multiple attention mechanisms for better segmentation. The convolutional-based position and channel attention mechanisms with a patch tokenization-based vision transformer (ViT)-based attention mechanism in a hybrid and end-to-end manner are integrated into the SMA. The CNN- and ViT-based attentions mine the short- and long-range dependencies for more precise segmentation. The SMA module is applied in an encoder-decoder structure with a ResNet50 backbone named CardSegNet. Furthermore, a deep supervision method with multi-loss functions is introduced to the CardSegNet optimizer to reduce overfitting and enhance the model’s performance. The proposed model is validated on the ACDC2017 (n=100), M&Ms (n=321), and a local dataset (n=22) using the 10-fold cross-validation method with promising segmentation results, demonstrating its outperformance versus its counterparts.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"115 ","pages":"Article 102382"},"PeriodicalIF":5.4000,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CardSegNet: An adaptive hybrid CNN-vision transformer model for heart region segmentation in cardiac MRI\",\"authors\":\"Hamed Aghapanah , Reza Rasti , Saeed Kermani , Faezeh Tabesh , Hossein Yousefi Banaem , Hamidreza Pour Aliakbar , Hamid Sanei , William Paul Segars\",\"doi\":\"10.1016/j.compmedimag.2024.102382\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Cardiovascular MRI (CMRI) is a non-invasive imaging technique adopted for assessing the blood circulatory system’s structure and function. Precise image segmentation is required to measure cardiac parameters and diagnose abnormalities through CMRI data. Because of anatomical heterogeneity and image variations, cardiac image segmentation is a challenging task. Quantification of cardiac parameters requires high-performance segmentation of the left ventricle (LV), right ventricle (RV), and left ventricle myocardium from the background. The first proposed solution here is to manually segment the regions, which is a time-consuming and error-prone procedure. In this context, many semi- or fully automatic solutions have been proposed recently, among which deep learning-based methods have revealed high performance in segmenting regions in CMRI data. In this study, a self-adaptive multi attention (SMA) module is introduced to adaptively leverage multiple attention mechanisms for better segmentation. The convolutional-based position and channel attention mechanisms with a patch tokenization-based vision transformer (ViT)-based attention mechanism in a hybrid and end-to-end manner are integrated into the SMA. The CNN- and ViT-based attentions mine the short- and long-range dependencies for more precise segmentation. The SMA module is applied in an encoder-decoder structure with a ResNet50 backbone named CardSegNet. Furthermore, a deep supervision method with multi-loss functions is introduced to the CardSegNet optimizer to reduce overfitting and enhance the model’s performance. The proposed model is validated on the ACDC2017 (n=100), M&Ms (n=321), and a local dataset (n=22) using the 10-fold cross-validation method with promising segmentation results, demonstrating its outperformance versus its counterparts.</p></div>\",\"PeriodicalId\":50631,\"journal\":{\"name\":\"Computerized Medical Imaging and Graphics\",\"volume\":\"115 \",\"pages\":\"Article 102382\"},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2024-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computerized Medical Imaging and Graphics\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0895611124000594\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computerized Medical Imaging and Graphics","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0895611124000594","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}

引用次数: 0

摘要

心血管磁共振成像（CMRI）是一种无创成像技术，用于评估血液循环系统的结构和功能。通过 CMRI 数据测量心脏参数和诊断异常需要精确的图像分割。由于解剖异质性和图像变化，心脏图像分割是一项具有挑战性的任务。心脏参数的量化需要从背景中高性能地分割出左心室（LV）、右心室（RV）和左心室心肌。这里提出的第一个解决方案是手动分割区域，这是一个耗时且容易出错的过程。在这种情况下，最近提出了许多半自动或全自动的解决方案，其中基于深度学习的方法在 CMRI 数据的区域分割方面表现出色。在本研究中，引入了自适应多重注意（SMA）模块，以自适应地利用多重注意机制来获得更好的分割效果。基于卷积的位置和通道注意力机制与基于补丁标记化的视觉转换器（ViT）注意力机制以混合和端到端的方式集成到了 SMA 中。基于 CNN 和 ViT 的注意力挖掘短程和长程依赖关系，以实现更精确的分割。SMA 模块被应用于以 ResNet50 为骨干的编码器-解码器结构中，并命名为 CardSegNet。此外，CardSegNet 优化器还引入了具有多损失函数的深度监督方法，以减少过拟合并提高模型性能。利用 10 倍交叉验证法，在 ACDC2017（n=100）、M&Ms（n=321）和本地数据集（n=22）上对所提出的模型进行了验证，结果显示其分割效果优于同类模型。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

CardSegNet: An adaptive hybrid CNN-vision transformer model for heart region segmentation in cardiac MRI

查看原文本刊更多论文

CardSegNet: An adaptive hybrid CNN-vision transformer model for heart region segmentation in cardiac MRI

Cardiovascular MRI (CMRI) is a non-invasive imaging technique adopted for assessing the blood circulatory system’s structure and function. Precise image segmentation is required to measure cardiac parameters and diagnose abnormalities through CMRI data. Because of anatomical heterogeneity and image variations, cardiac image segmentation is a challenging task. Quantification of cardiac parameters requires high-performance segmentation of the left ventricle (LV), right ventricle (RV), and left ventricle myocardium from the background. The first proposed solution here is to manually segment the regions, which is a time-consuming and error-prone procedure. In this context, many semi- or fully automatic solutions have been proposed recently, among which deep learning-based methods have revealed high performance in segmenting regions in CMRI data. In this study, a self-adaptive multi attention (SMA) module is introduced to adaptively leverage multiple attention mechanisms for better segmentation. The convolutional-based position and channel attention mechanisms with a patch tokenization-based vision transformer (ViT)-based attention mechanism in a hybrid and end-to-end manner are integrated into the SMA. The CNN- and ViT-based attentions mine the short- and long-range dependencies for more precise segmentation. The SMA module is applied in an encoder-decoder structure with a ResNet50 backbone named CardSegNet. Furthermore, a deep supervision method with multi-loss functions is introduced to the CardSegNet optimizer to reduce overfitting and enhance the model’s performance. The proposed model is validated on the ACDC2017 (n=100), M&Ms (n=321), and a local dataset (n=22) using the 10-fold cross-validation method with promising segmentation results, demonstrating its outperformance versus its counterparts.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computerized Medical Imaging and Graphics 医学-核医学

CiteScore

10.70

自引率

3.50%

发文量

审稿时长

26 days

期刊介绍： The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.