Autoencoder-based conditional optimal transport generative adversarial network for medical image generation

Jun Wang, Bohan Lei, Liya Ding, Xiaoyin Xu, Xianfeng Gu, Min Zhang

Visual Informatics, Volume 8, Issue 1 (March 2024), Pages 15–25
DOI: 10.1016/j.visinf.2023.11.001
URL: https://www.sciencedirect.com/science/article/pii/S2468502X23000529
Citations: 0
Abstract
Medical image generation has recently garnered significant interest among researchers. However, mainstream generative models, such as Generative Adversarial Networks (GANs), often encounter challenges during training, including mode collapse. To address these issues, we propose the AE-COT-GAN model (Autoencoder-based Conditional Optimal Transport Generative Adversarial Network) for the generation of medical images belonging to specific categories. The training process of our model comprises three fundamental components. First, we employ an autoencoder to obtain a low-dimensional manifold representation of real images. Second, we apply extended semi-discrete optimal transport to map the Gaussian noise distribution to the latent-space distribution and to obtain the corresponding labels efficiently; this procedure yields new latent codes with known labels. Finally, we integrate a GAN to further train the decoder to generate medical images. To evaluate the performance of the AE-COT-GAN model, we conducted experiments on two medical image datasets, DermaMNIST and BloodMNIST, and compared the model against state-of-the-art generative models. Results show that AE-COT-GAN performs excellently in generating medical images. Moreover, it effectively addresses the common issues associated with traditional GANs.
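The second component — semi-discrete optimal transport from a continuous Gaussian noise distribution to a discrete set of labeled latent codes — can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the latent codes below are random placeholders standing in for autoencoder outputs, and the target measure is assumed uniform. Each code carries a dual potential h_j that defines a power-diagram cell in noise space; stochastic gradient ascent on the dual objective adjusts h until every code receives its prescribed share of the Gaussian mass, which is the mechanism that prevents any latent code (and hence any mode) from being starved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent codes y_j (stand-ins for autoencoder embeddings).
n_codes, dim = 8, 2
codes = rng.normal(size=(n_codes, dim))
target_mass = np.full(n_codes, 1.0 / n_codes)  # nu_j: uniform target measure

h = np.zeros(n_codes)            # dual potentials (one per latent code)
lr, batch, iters = 0.5, 2048, 500

for _ in range(iters):
    z = rng.normal(size=(batch, dim))  # samples from the Gaussian source
    # Power-diagram assignment: j*(z) = argmin_j ||z - y_j||^2 / 2 - h_j
    cost = 0.5 * ((z[:, None, :] - codes[None, :, :]) ** 2).sum(-1) - h
    assign = cost.argmin(axis=1)
    # Empirical Gaussian mass currently falling in each cell.
    empirical = np.bincount(assign, minlength=n_codes) / batch
    # Dual gradient ascent: grow h_j for under-served codes, shrink otherwise.
    h += lr * (target_mass - empirical)

# Evaluate the learned transport map: each cell should now receive ~1/n_codes.
z = rng.normal(size=(20000, dim))
cost = 0.5 * ((z[:, None, :] - codes[None, :, :]) ** 2).sum(-1) - h
final = np.bincount(cost.argmin(axis=1), minlength=n_codes) / 20000
```

After optimization, mapping a noise sample to the argmin cell yields a latent code whose label is known (the code's label), which is how the abstract's "new latent codes with known labels" arise before the decoder is refined adversarially.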