Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation

Jiaojiao Zhang, Shuo Zhang, Xiaoqian Shen, Thomas Lukasiewicz, Zhenghua Xu
{"title":"Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation","authors":"Jiaojiao Zhang;Shuo Zhang;Xiaoqian Shen;Thomas Lukasiewicz;Zhenghua Xu","doi":"10.1109/TMI.2023.3290356","DOIUrl":null,"url":null,"abstract":"Existing self-supervised medical image segmentation usually encounters the domain shift problem (i.e., the input distribution of pre-training is different from that of fine-tuning) and/or the multimodality problem (i.e., it is based on single-modal data only and cannot utilize the fruitful multimodal information of medical images). To solve these problems, in this work, we propose multimodal contrastive domain sharing (Multi-ConDoS) generative adversarial networks to achieve effective multimodal contrastive self-supervised medical image segmentation. Compared to the existing self-supervised approaches, Multi-ConDoS has the following three advantages: (i) it utilizes multimodal medical images to learn more comprehensive object features via multimodal contrastive learning; (ii) domain translation is achieved by integrating the cyclic learning strategy of CycleGAN and the cross-domain translation loss of Pix2Pix; (iii) novel domain sharing layers are introduced to learn not only domain-specific but also domain-sharing information from the multimodal medical images. Extensive experiments on two publicly multimodal medical image segmentation datasets show that, with only 5% (resp., 10%) of labeled data, Multi-ConDoS not only greatly outperforms the state-of-the-art self-supervised and semi-supervised medical image segmentation baselines with the same ratio of labeled data, but also achieves similar (sometimes even better) performances as fully supervised segmentation methods with 50% (resp., 100%) of labeled data, which thus proves that our work can achieve superior segmentation performances with very low labeling workload. Furthermore, ablation studies prove that the above three improvements are all effective and essential for Multi-ConDoS to achieve this very superior performance.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"43 1","pages":"76-95"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10167829/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Existing self-supervised medical image segmentation usually encounters the domain shift problem (i.e., the input distribution of pre-training differs from that of fine-tuning) and/or the multimodality problem (i.e., it is based on single-modal data only and cannot exploit the rich multimodal information of medical images). To solve these problems, in this work, we propose multimodal contrastive domain sharing (Multi-ConDoS) generative adversarial networks to achieve effective multimodal contrastive self-supervised medical image segmentation. Compared to existing self-supervised approaches, Multi-ConDoS has the following three advantages: (i) it utilizes multimodal medical images to learn more comprehensive object features via multimodal contrastive learning; (ii) domain translation is achieved by integrating the cyclic learning strategy of CycleGAN with the cross-domain translation loss of Pix2Pix; and (iii) novel domain-sharing layers are introduced to learn not only domain-specific but also domain-sharing information from the multimodal medical images. Extensive experiments on two public multimodal medical image segmentation datasets show that, with only 5% (resp., 10%) of labeled data, Multi-ConDoS not only greatly outperforms state-of-the-art self-supervised and semi-supervised medical image segmentation baselines using the same ratio of labeled data, but also achieves performance similar to (and sometimes better than) that of fully supervised segmentation methods using 50% (resp., 100%) of labeled data, which proves that our approach achieves superior segmentation performance with a very low labeling workload. Furthermore, ablation studies show that all three improvements are effective and essential for Multi-ConDoS to achieve this superior performance.
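Since the abstract only names the building blocks, the sketch below illustrates how advantages (i) and (ii) could fit together in PyTorch: an InfoNCE-style contrastive term over paired modality embeddings, and a generator objective combining CycleGAN-style cycle-consistency with a Pix2Pix-style paired L1 translation loss. All function names, network handles, and loss weights here are hypothetical illustrations inferred from the abstract, not the authors' implementation.

```python
# Minimal sketch inferred from the abstract, NOT the authors' code:
# (i) a symmetric InfoNCE contrastive loss over paired modality embeddings;
# (ii) a generator objective mixing CycleGAN cycle-consistency with a
#      Pix2Pix-style paired L1 translation loss.
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric contrastive loss: matched (A, B) embeddings are positives."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature          # pairwise similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def generator_loss(g_ab, g_ba, d_b, real_a, real_b,
                   lambda_cyc=10.0, lambda_pix=5.0):
    """Modality A -> B translation with adversarial, cycle, and paired L1 terms."""
    fake_b = g_ab(real_a)                         # A -> B translation
    rec_a = g_ba(fake_b)                          # B -> A reconstruction (cycle)

    # Least-squares adversarial term: fool the B-domain discriminator.
    pred = d_b(fake_b)
    adv = F.mse_loss(pred, torch.ones_like(pred))

    # CycleGAN-style cycle-consistency: A -> B -> A should recover the input.
    cyc = F.l1_loss(rec_a, real_a)

    # Pix2Pix-style cross-domain translation loss; only meaningful when the
    # two modalities are registered (paired) scans of the same subject.
    pix = F.l1_loss(fake_b, real_b)

    return adv + lambda_cyc * cyc + lambda_pix * pix
```

The hypothetical loss weights (lambda_cyc, lambda_pix) follow common CycleGAN/Pix2Pix conventions; in practice they would be tuned per dataset.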