A Comprehensive Survey of Image Generation Models Based on Deep Learning

Q1 Decision Sciences

Annals of Data Science Pub Date : 2024-06-20 DOI:10.1007/s40745-024-00544-1

Jun Li, Chenyang Zhang, Wei Zhu, Yawei Ren

Volume 12(1), pages 141-170.

Citation count: 0

Abstract

Generative artificial intelligence has developed rapidly in recent years. In the image domain, image generation models based on deep learning have made remarkable achievements. Early frameworks for image generation were dominated by generative adversarial networks (GANs) and variational autoencoders (VAEs). Today, large-scale generative models based on diffusion models have become mainstream, and the quality of their generated images has improved significantly. We review the research and development of image generation models and examine the significant progress made in the field in recent years. First, we revisit the development of traditional image generation models such as GANs and VAEs, emphasizing their contributions and challenges. We also introduce diffusion models, which have received much attention in image generation because of their distinctive generative process and strong generative performance. Next, we discuss large vision models, with SAM as the focal point. We also pay special attention to large-scale generative models such as Stable Diffusion, which have demonstrated unprecedented capabilities in high-quality image generation tasks. Additionally, we explore target models and their respective fine-tuning methods for domain-oriented image generation tasks, predict future directions in image generation, and propose potential research focuses and challenges.


Source journal

Annals of Data Science Decision Sciences-Statistics, Probability and Uncertainty

CiteScore

6.50

Self-citation rate

0.00%

Number of articles published

Journal description: Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results and case studies in data science. Although Data Science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management and virtualization to discover knowledge from Big Data, it should have its own scientific contents, such as axioms, laws and rules, which are fundamentally important for experts in different fields to explore their own interests from Big Data. ADS encourages contributors to address such challenging problems at this exchange platform. At present, the question of how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.