{"title":"Multi-Sentence Hierarchical Generative Adversarial Network GAN (MSH-GAN) for Automatic Text-to-Image Generation","authors":"Elham Pejhan, M. Ghasemzadeh","doi":"10.22044/JADM.2021.10837.2224","DOIUrl":null,"url":null,"abstract":"This research is related to the development of technology in the field of automatic text to image generation. In this regard, two main goals are pursued; first, the generated image should look as real as possible; and second, the generated image should be a meaningful description of the input text. our proposed method is a Multi Sentences Hierarchical GAN (MSH-GAN) for text to image generation. In this research project, we have considered two main strategies: 1) produce a higher quality image in the first step, and 2) use two additional descriptions to improve the original image in the next steps. Our goal is to focus on using more information to generate images with higher resolution by using more than one sentence input text. We have proposed different models based on GANs and Memory Networks. We have also used more challenging dataset called ids-ade. This is the first time; this dataset has been used in this area. We have evaluated our models based on IS, FID and, R-precision evaluation metrics. Experimental results demonstrate that our best model performs favorably against the basic state-of-the-art approaches like StackGAN and AttGAN.","PeriodicalId":32592,"journal":{"name":"Journal of Artificial Intelligence and Data Mining","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Artificial Intelligence and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22044/JADM.2021.10837.2224","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
This research is related to the development of technology in the field of automatic text to image generation. In this regard, two main goals are pursued; first, the generated image should look as real as possible; and second, the generated image should be a meaningful description of the input text. our proposed method is a Multi Sentences Hierarchical GAN (MSH-GAN) for text to image generation. In this research project, we have considered two main strategies: 1) produce a higher quality image in the first step, and 2) use two additional descriptions to improve the original image in the next steps. Our goal is to focus on using more information to generate images with higher resolution by using more than one sentence input text. We have proposed different models based on GANs and Memory Networks. We have also used more challenging dataset called ids-ade. This is the first time; this dataset has been used in this area. We have evaluated our models based on IS, FID and, R-precision evaluation metrics. Experimental results demonstrate that our best model performs favorably against the basic state-of-the-art approaches like StackGAN and AttGAN.