CC-TransXNet: a hybrid CNN-transformer network for automatic segmentation of optic cup and optic disk from fundus images.

IF 2.6 | Zone 4, Medicine | JCR Q2, Computer Science, Interdisciplinary Applications

Medical & Biological Engineering & Computing | Pub Date: 2025-04-01 | Epub Date: 2024-11-27 | DOI: 10.1007/s11517-024-03244-3

Zhongzheng Yuan, Jinke Wang, Yukun Xu, Min Xu

Pages: 1027-1044. Publication type: Journal Article.

Citations: 0

Abstract

Accurate segmentation of the optic disk (OD) and optic cup (OC) regions of the optic nerve head is a critical step in glaucoma diagnosis. Existing architectures based on convolutional neural networks (CNNs) still capture too little global information and generalize poorly to small-sample datasets. Advanced transformer-based models, although capable of capturing global image features, perform poorly in medical image segmentation because of their large parameter counts and insufficient local spatial information. To address these two problems, we propose a W-shaped hybrid network framework, CC-TransXNet, which combines the advantages of CNNs and transformers. First, by employing TransXNet and an improved ResNet as feature extraction modules, the network considers both local and global features to enhance its generalization ability. Second, the convolutional block attention module (CBAM) is introduced into the residual structure to improve recognition of the OD and OC by applying attention in both the channel and spatial dimensions. Third, the Contextual Attention (CoT) self-attention mechanism is used in the skip connections to adaptively allocate attention to contextual information, further improving segmentation accuracy. We conducted experiments on four publicly available datasets (REFUGE 2, RIM-ONE DL, GAMMA, and Drishti-GS). Compared with the traditional U-Net and with CNN-based and transformer-based networks, the proposed CC-TransXNet improves segmentation accuracy and significantly enhances generalization on small datasets. Moreover, CC-TransXNet keeps the number of model parameters under control through its optimized design, which reduces the risk of overfitting and demonstrates its potential for efficient segmentation.
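The channel-then-spatial attention that the abstract attributes to CBAM can be illustrated with a small NumPy sketch. It follows the generic CBAM design (a shared MLP over average-pooled and max-pooled channel descriptors, then a convolution over channel-wise mean and max maps) and is not the authors' implementation: the abstract does not say how CBAM is wired into the residual structure, and the function names, random weights, reduction ratio `r = 4`, and 7x7 kernel below are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight channels of feat (C, H, W) with a shared two-layer MLP.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical weights standing in for learned parameters.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear -> ReLU -> linear

    gate = sigmoid(mlp(avg) + mlp(mx))  # (C,) values in (0, 1)
    return feat * gate[:, None, None]


def spatial_attention(feat, kernel):
    """Reweight spatial positions using a (2, k, k) convolution kernel."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    logits = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(logits)[None, :, :]


def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)


# Toy feature map and randomly initialised weights (assumed sizes).
rng = np.random.default_rng(0)
C, H, W, r, k = 8, 16, 16, 4, 7
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, k, k)) * 0.1

out = cbam(feat, w1, w2, kernel)
print(out.shape)
```

Both gates lie in (0, 1), so the module can only attenuate activations, never amplify them; in a residual block the gated features would typically be added back to the block input, though where exactly CC-TransXNet does this is not stated in the abstract.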


Source journal

Medical & Biological Engineering & Computing (Medicine - Engineering: Biomedical)

CiteScore: 6.00

Self-citation rate: 3.10%

Articles published: 249

Review time: 3.5 months

About the journal: Founded in 1963, Medical & Biological Engineering & Computing (MBEC) continues to serve the biomedical engineering community, covering the entire spectrum of biomedical and clinical engineering. The journal presents exciting and vital experimental and theoretical developments in biomedical science and technology, and reports on advances in computer-based methodologies in these multidisciplinary subjects. The journal also incorporates new and evolving technologies, including cellular engineering and molecular imaging. MBEC publishes original research articles as well as reviews and technical notes. Its Rapid Communications category focuses on material of immediate value to the readership, while the Controversies section provides a forum to exchange views on selected issues, stimulating a vigorous and informed debate in this exciting and high-profile field. MBEC is an official journal of the International Federation of Medical and Biological Engineering (IFMBE).