{"title":"A Novel Deep Learning Model for Medical Image Segmentation with Convolutional Neural Network and Transformer.","authors":"Zhuo Zhang, Hongbing Wu, Huan Zhao, Yicheng Shi, Jifang Wang, Hua Bai, Baoshan Sun","doi":"10.1007/s12539-023-00585-9","DOIUrl":null,"url":null,"abstract":"<p><p>Accurate segmentation of medical images is essential for clinical decision-making, and deep learning techniques have shown remarkable results in this area. However, existing segmentation models that combine transformer and convolutional neural networks often use skip connections in U-shaped networks, which may limit their ability to capture contextual information in medical images. To address this limitation, we propose a coordinated mobile and residual transformer UNet (MRC-TransUNet) that combines the strengths of transformer and UNet architectures. Our approach uses a lightweight MR-ViT to address the semantic gap and a reciprocal attention module to compensate for the potential loss of details. To better explore long-range contextual information, we use skip connections only in the first layer and add MR-ViT and RPA modules in the subsequent downsampling layers. In our study, we evaluated the effectiveness of our proposed method on three different medical image segmentation datasets, namely, breast, brain, and lung. Our proposed method outperformed state-of-the-art methods in terms of various evaluation metrics, including the Dice coefficient and Hausdorff distance. These results demonstrate that our proposed method can significantly improve the accuracy of medical image segmentation and has the potential for clinical applications. Illustration of the proposed MRC-TransUNet. For the input medical images, we first subject them to an intrinsic downsampling operation and then replace the original jump connection structure using MR-ViT. The output feature representations at different scales are fused by the RPA module. Finally, an upsampling operation is performed to fuse the features to restore them to the same resolution as the input image.</p>","PeriodicalId":13670,"journal":{"name":"Interdisciplinary Sciences: Computational Life Sciences","volume":" ","pages":"663-677"},"PeriodicalIF":3.9000,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Interdisciplinary Sciences: Computational Life Sciences","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1007/s12539-023-00585-9","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/9/4 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate segmentation of medical images is essential for clinical decision-making, and deep learning techniques have shown remarkable results in this area. However, existing segmentation models that combine transformer and convolutional neural networks often use skip connections in U-shaped networks, which may limit their ability to capture contextual information in medical images. To address this limitation, we propose a coordinated mobile and residual transformer UNet (MRC-TransUNet) that combines the strengths of transformer and UNet architectures. Our approach uses a lightweight MR-ViT to address the semantic gap and a reciprocal attention module to compensate for the potential loss of details. To better explore long-range contextual information, we use skip connections only in the first layer and add MR-ViT and RPA modules in the subsequent downsampling layers. In our study, we evaluated the effectiveness of our proposed method on three different medical image segmentation datasets, namely, breast, brain, and lung. Our proposed method outperformed state-of-the-art methods in terms of various evaluation metrics, including the Dice coefficient and Hausdorff distance. These results demonstrate that our proposed method can significantly improve the accuracy of medical image segmentation and has the potential for clinical applications. Illustration of the proposed MRC-TransUNet. For the input medical images, we first subject them to an intrinsic downsampling operation and then replace the original jump connection structure using MR-ViT. The output feature representations at different scales are fused by the RPA module. Finally, an upsampling operation is performed to fuse the features to restore them to the same resolution as the input image.
期刊介绍:
Interdisciplinary Sciences--Computational Life Sciences aims to cover the most recent and outstanding developments in interdisciplinary areas of sciences, especially focusing on computational life sciences, an area that is enjoying rapid development at the forefront of scientific research and technology.
The journal publishes original papers of significant general interest covering recent research and developments. Articles will be published rapidly by taking full advantage of internet technology for online submission and peer-reviewing of manuscripts, and then by publishing OnlineFirstTM through SpringerLink even before the issue is built or sent to the printer.
The editorial board consists of many leading scientists with international reputation, among others, Luc Montagnier (UNESCO, France), Dennis Salahub (University of Calgary, Canada), Weitao Yang (Duke University, USA). Prof. Dongqing Wei at the Shanghai Jiatong University is appointed as the editor-in-chief; he made important contributions in bioinformatics and computational physics and is best known for his ground-breaking works on the theory of ferroelectric liquids. With the help from a team of associate editors and the editorial board, an international journal with sound reputation shall be created.