Multi-Sentence Hierarchical Generative Adversarial Network GAN (MSH-GAN) for Automatic Text-to-Image Generation

Journal of Artificial Intelligence and Data Mining. Pub Date: 2021-08-31. DOI: 10.22044/JADM.2021.10837.2224

Elham Pejhan, M. Ghasemzadeh




Multi-Sentence Hierarchical Generative Adversarial Network GAN (MSH-GAN) for Automatic Text-to-Image Generation

This research concerns automatic text-to-image generation. Two main goals are pursued: first, the generated image should look as realistic as possible; second, it should be a meaningful depiction of the input text. Our proposed method is a Multi-Sentence Hierarchical GAN (MSH-GAN) for text-to-image generation. We consider two main strategies: 1) produce a higher-quality image in the first step, and 2) use two additional descriptions to improve the original image in the subsequent steps. Our goal is to use more information, in the form of input text with more than one sentence, to generate higher-resolution images. We propose several models based on GANs and Memory Networks. We also use a more challenging dataset called ids-ade; this is the first time this dataset has been used in this area. We evaluate our models with the Inception Score (IS), Fréchet Inception Distance (FID), and R-precision metrics. Experimental results show that our best model performs favorably against baseline state-of-the-art approaches such as StackGAN and AttGAN.
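The abstract does not say how R-precision was computed. As a minimal sketch, assuming the protocol common in the text-to-image literature (each generated image is compared against a set of candidate captions by cosine similarity, and a hit is counted when the ground-truth caption ranks in the top r), the metric could look like the following. The function name `r_precision`, the array shapes, and the embeddings themselves are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def r_precision(image_emb, text_emb, true_idx, r=1):
    """Fraction of images whose ground-truth caption ranks in the top r.

    image_emb: (n, d) array, one embedding per generated image
               (assumed to come from some pretrained image encoder).
    text_emb:  (n, k, d) array, k candidate caption embeddings per image.
    true_idx:  (n,) array, position of the ground-truth caption among
               the k candidates for each image.
    """
    image_emb = np.asarray(image_emb, dtype=float)
    text_emb = np.asarray(text_emb, dtype=float)
    true_idx = np.asarray(true_idx)

    # L2-normalize so that a dot product equals cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=2, keepdims=True)

    # sims[i, j] = cosine similarity between image i and its j-th candidate.
    sims = np.einsum("nd,nkd->nk", img, txt)

    # Rank of the true caption = number of candidates scoring strictly higher.
    true_sim = sims[np.arange(sims.shape[0]), true_idx]
    rank = (sims > true_sim[:, None]).sum(axis=1)

    return float((rank < r).mean())
```

With r=1 the score is the share of images for which the ground-truth caption is the single best match, so a higher value indicates closer agreement between the generated image and its input text.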
