Text to Image Translation using GAN with NLP and Computer Vision

IF 1.2 | Zone 4, Earth Sciences | JCR Q3, GEOCHEMISTRY & GEOPHYSICS

Periodico Di Mineralogia | Pub Date: 2022-04-01 | DOI: 10.37896/pd91.4/91449

R. Perumalraja, A. S. Arjunkumar, N. N. Mohamed, E. Siva, S. Kamalesh

Citations: 0

Abstract

Generating high-quality images from text queries is a challenging problem in computer vision with many practical applications. This paper proposes Stacked Generative Adversarial Networks (StackGAN) to generate 256 × 256 photo-realistic images conditioned on text descriptions. We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding low-resolution Stage-I images. The Stage-II GAN takes the Stage-I results and the text descriptions as inputs and generates high-resolution images with photo-realistic details. It can correct defects in the Stage-I results and add compelling details through the refinement process. To improve the diversity of the generated images and regularize conditional-GAN training, we introduce a novel Conditioning Augmentation technique. Experiments and comparisons with state-of-the-art methods on benchmark datasets demonstrate that the proposed method achieves significant improvements in generating photo-realistic images conditioned on text queries.
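
The abstract names Conditioning Augmentation without giving its details. The following is a minimal sketch, assuming the formulation commonly described for StackGAN: the conditioning vector is sampled from a Gaussian whose mean and diagonal variance are predicted from the text embedding, and a KL penalty pulls that Gaussian toward N(0, I). The function names, the 4-dimensional vectors and the use of log-variance are illustrative assumptions, not taken from the paper.

```python
import math
import random


def conditioning_augmentation(mu, log_var, rng=random):
    """Sample c = mu + sigma * eps, with eps ~ N(0, I) (reparameterization).

    mu and log_var stand in for the outputs of a text-encoder head
    (hypothetical here); sigma is recovered as exp(0.5 * log_var).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one text description.
mu = [0.5, -1.0, 0.0, 2.0]
log_var = [0.0, -2.0, 0.0, -4.0]

rng = random.Random(0)  # seeded only to make the sketch repeatable
c1 = conditioning_augmentation(mu, log_var, rng)
c2 = conditioning_augmentation(mu, log_var, rng)

# The same description yields different conditioning vectors on each draw,
# which is the source of the extra variety in generated images.
print(c1)
print(c2)
print(kl_to_standard_normal(mu, log_var))
```

In a full model the sampled vector would be concatenated with a noise vector and fed to the Stage-I generator, and the KL term would be added to the generator loss with a weighting factor; both steps are omitted here.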

Source journal

Periodico Di Mineralogia (Earth Sciences: Geochemistry and Geophysics)

CiteScore: 1.50
Self-citation rate: 14.30%
Articles published:
Review time: >12 weeks

About the journal: Periodico di Mineralogia is an international peer-reviewed Open Access journal publishing Research Articles, Letters and Reviews in Mineralogy, Crystallography, Geochemistry, Ore Deposits, Petrology, Volcanology and applied topics on Environment, Archaeometry and Cultural Heritage. The journal aims to encourage scientists to publish their experimental and theoretical results in as much detail as possible. Accordingly, there is no restriction on article length. Additional data may be hosted on the journal's web site as Supplementary Information. The journal has no article submission or processing charges. Colour is free of charge both online and in print, and no Open Access fees are requested. Short publication time is assured. Periodico di Mineralogia is the property of Sapienza Università di Roma and is published, both online and in print, three times a year.