Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures.

Impact Factor: 2.7 | JCR Q3 | Imaging Science & Photographic Technology

Journal of Imaging | Pub Date: 2025-07-26 | DOI: 10.3390/jimaging11080252

Zineb Sordo, Eric Chagnon, Zixi Hu, Jeffrey J Donatelli, Peter Andeer, Peter S Nico, Trent Northen, Daniela Ushizima

Volume 11, Issue 8 | Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12387873/pdf/

Citations: 0

Abstract

Generative AI (genAI) has emerged as a powerful tool for synthesizing diverse and complex image data, offering new possibilities for scientific imaging applications. This review presents a comparative analysis of leading generative architectures, from Variational Autoencoders (VAEs) through Generative Adversarial Networks (GANs) to Diffusion Models, in the context of scientific image synthesis. We examine each model's foundational principles, recent architectural advancements, and practical trade-offs. Our evaluation, conducted on domain-specific datasets including microCT scans of rocks and composite fibers, as well as high-resolution images of plant roots, integrates both quantitative metrics (SSIM, LPIPS, FID, CLIPScore) and expert-driven qualitative assessments. Results show that GANs, particularly StyleGAN, produce images with high perceptual quality and structural coherence. Diffusion-based models for inpainting and image variation, such as DALL-E 2, delivered high realism and semantic alignment but generally struggled to balance visual fidelity with scientific accuracy. Importantly, our findings reveal limitations of standard quantitative metrics in capturing scientific relevance, underscoring the need for domain-expert validation. We conclude by discussing key challenges such as model interpretability, computational cost, and verification protocols, and by outlining future directions where generative AI can drive innovation in data augmentation, simulation, and hypothesis generation in scientific research.
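Of the quantitative metrics named in the abstract, SSIM is the simplest to illustrate. The sketch below is not the authors' evaluation code: it is a minimal, single-window version of SSIM that computes the statistics once over the whole image, whereas standard implementations average the index over local sliding windows. The function name `global_ssim`, the flattened-list input, and the 8-bit `data_range` default are assumptions made for illustration; the stabilizing constants use the conventional K1 = 0.01 and K2 = 0.03.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 255.0) -> float:
    """Single-window SSIM for two flattened grayscale images of equal size.

    Simplification (assumption): statistics are taken over the whole image
    rather than averaged over local windows as in standard SSIM.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")

    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n

    # Population variances and covariance of pixel intensities.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    # Small constants that keep the ratio stable when means or variances are near zero.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    real = [0.0, 255.0, 0.0, 255.0]       # toy 2x2 "real" image, flattened
    inverted = [255.0, 0.0, 255.0, 0.0]   # same structure with contrast inverted
    print(global_ssim(real, real))        # identical images score (approximately) 1
    print(global_ssim(real, inverted))    # anti-correlated structure scores below 0
```

A score near 1 only says that two images share luminance, contrast, and structure statistics; it says nothing about whether a synthetic microCT slice or root image is physically plausible, which is consistent with the abstract's point that such metrics need to be paired with domain-expert validation.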
