{"title":"CH-Net: A Cross Hybrid Network for Medical Image Segmentation","authors":"Jiale Li;Aiping Liu;Wei Wei;Ruobing Qian;Xun Chen","doi":"10.1109/TAI.2024.3503541","DOIUrl":null,"url":null,"abstract":"Accurate and automated segmentation of medical images plays a crucial role in diagnostic evaluation and treatment planning. In recent years, hybrid models have gained considerable popularity in diverse medical image segmentation tasks, as they leverage the benefits of both convolution and self-attention to capture local and global dependencies simultaneously. However, most existing hybrid models treat convolution and self-attention as independent components and integrate them using simple fusion methods, neglecting the potential complementary information between their weight allocation mechanisms. To address this issue, we propose a cross hybrid network (CH-Net) for medical image segmentation, in which convolution and self-attention are hybridized in a cross-collaborative manner. Specifically, we introduce a cross hybrid module (CHM) between the parallel convolution layer and self-attention layer in each building block of CH-Net. This module extracts attention with distinct dimensional information from convolution and self-attention, respectively, and uses this complementary information to enhance the feature representation of both components. In contrast to the traditional approach where each module learned independently, the CHM facilitates the interactive learning of complementary information between convolutional layer and self-attention layer, which significantly enhances the segmentation capabilities of the model. The superiority of our approach over various hybrid models is demonstrated through experimental evaluations conducted on three publicly available benchmarks: ACDC, synapse, and EM.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 4","pages":"934-944"},"PeriodicalIF":0.0000,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10758929/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Accurate and automated segmentation of medical images plays a crucial role in diagnostic evaluation and treatment planning. In recent years, hybrid models have gained considerable popularity in diverse medical image segmentation tasks, as they leverage the benefits of both convolution and self-attention to capture local and global dependencies simultaneously. However, most existing hybrid models treat convolution and self-attention as independent components and integrate them using simple fusion methods, neglecting the potential complementary information between their weight allocation mechanisms. To address this issue, we propose a cross hybrid network (CH-Net) for medical image segmentation, in which convolution and self-attention are hybridized in a cross-collaborative manner. Specifically, we introduce a cross hybrid module (CHM) between the parallel convolution layer and self-attention layer in each building block of CH-Net. This module extracts attention maps carrying distinct dimensional information from the convolution and self-attention branches, respectively, and uses this complementary information to enhance the feature representation of both components. In contrast to the traditional approach, in which each module learns independently, the CHM enables interactive learning of complementary information between the convolutional layer and the self-attention layer, which significantly enhances the segmentation capability of the model. The superiority of our approach over various hybrid models is demonstrated through experimental evaluations on three publicly available benchmarks: ACDC, Synapse, and EM.
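To make the described building block concrete, below is a minimal, hypothetical PyTorch sketch of one parallel convolution/self-attention block with a cross hybrid module. The abstract does not specify the exact CHM design, so the choice here of deriving channel attention from the convolution branch and spatial attention from the self-attention branch, and using each to re-weight the other, is an illustrative assumption; the class and layer names are not the authors' implementation.

```python
# Hypothetical sketch of a CH-Net-style building block (not the paper's code).
import torch
import torch.nn as nn


class CrossHybridBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Parallel local (convolution) and global (self-attention) branches.
        self.conv_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.attn_branch = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

        # Cross hybrid module (CHM), assumed design: each branch produces an
        # attention map that re-weights the other branch's features.
        self.channel_gate = nn.Sequential(   # conv features -> channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(   # attention features -> spatial attention
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape

        # Local branch.
        f_conv = self.conv_branch(x)                        # (B, C, H, W)

        # Global branch: flatten spatial positions into a token sequence.
        tokens = self.norm(x.flatten(2).transpose(1, 2))    # (B, H*W, C)
        f_attn, _ = self.attn_branch(tokens, tokens, tokens)
        f_attn = f_attn.transpose(1, 2).reshape(b, c, h, w)

        # CHM: exchange complementary attention between the two branches.
        ch_gate = self.channel_gate(f_conv)                 # (B, C, 1, 1)
        sp_gate = self.spatial_gate(f_attn)                 # (B, 1, H, W)
        f_attn = f_attn * ch_gate                           # conv guides channels
        f_conv = f_conv * sp_gate                           # attention guides space

        return x + f_conv + f_attn                          # residual fusion


if __name__ == "__main__":
    block = CrossHybridBlock(channels=64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```

The key point the sketch tries to capture is the cross-collaborative weighting: neither branch's attention acts only on its own features, so the two weight allocation mechanisms inform each other rather than being fused after independent learning.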