A Comparative Study of Generative Adversarial Networks for Text-to-Image Synthesis

Int. J. Softw. Sci. Comput. Intell. Pub Date: 2022-01-01 DOI: 10.4018/ijssci.300364

M. Chopra, Sunil K. Singh, Akhil Sharma, Shabeg Singh Gill

Citations: 2

Abstract

Text-to-image synthesis is the conversion of a textual description into a semantically matching image. Automatically synthesizing high-quality images from text descriptions is both exciting and useful. Current AI systems have made significant advances in this area, but the work is still far from complete. Recent advances in deep learning have introduced generative models that can produce realistic images when trained appropriately. In this paper, the authors review advances in architectures for synthesizing images from text descriptions. They begin with the concepts of the standard GAN and how the DCGAN has been applied to the task. They then cover StackGAN, which uses a stack of two GANs to generate an image through iterative refinement, and StackGAN++, which arranges multiple GANs in a tree-like structure to make text-to-image generation more general. They also look at AttnGAN, which uses an attentional model to generate sub-regions of an image based on the description.
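The attentional idea mentioned for AttnGAN, where each image sub-region draws on the most relevant words of the description, can be illustrated with a small sketch. This is not the AttnGAN implementation: the dot-product similarity, the two-dimensional toy embeddings, and the names `softmax`, `dot`, and `word_context` are all assumptions chosen for illustration, and a real model would learn its word and region features with neural networks.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def word_context(region, word_vecs):
    """Attend over the words of a description for one image sub-region.

    Returns (weights, context): the attention weights over words and
    the attention-weighted sum of the word vectors.
    Similarity is a plain dot product (an assumption for this sketch).
    """
    weights = softmax([dot(region, w) for w in word_vecs])
    dim = len(word_vecs[0])
    context = [
        sum(wt * w[d] for wt, w in zip(weights, word_vecs))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical toy data: two words and two sub-region feature vectors.
words = ["red", "bird"]
word_vecs = [[1.0, 0.0], [0.0, 1.0]]
regions = [[2.0, 0.0], [0.0, 2.0]]

for i, region in enumerate(regions):
    weights, context = word_context(region, word_vecs)
    top_word = words[weights.index(max(weights))]
    print(f"region {i}: attends most to '{top_word}'")
```

In this toy setup each sub-region ends up with a context vector dominated by the word whose embedding is most similar to it, which is the sense in which an attentional generator can condition different parts of an image on different words.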

