CcGL-GAN: Criss-Cross Attention and Global-Local Discriminator Generative Adversarial Networks for text-to-image synthesis

Xihong Ye, Luanhao Lu
2021 International Joint Conference on Neural Networks (IJCNN), published 2021-07-18
DOI: 10.1109/IJCNN52387.2021.9533396 (https://doi.org/10.1109/IJCNN52387.2021.9533396)

Abstract

Text-to-image synthesis aims to generate a visually realistic image from a natural-language text description. Visual quality and semantic consistency are its two key objectives. Although Generative Adversarial Networks (GANs) have brought remarkable progress in image resolution, guaranteeing semantic consistency between the generated image and the description remains challenging. In this paper, we address this problem by proposing Criss-Cross Attention and Global-Local Discriminator Generative Adversarial Networks (CcGL-GAN). CcGL-GAN exploits a Criss-Cross Attention mechanism to capture variation in the contextual description, which enables the back-end generators to generate images more efficiently. Moreover, it uses Global-Local discriminators that project low-resolution images onto global linguistic representations and high-resolution images onto local linguistic representations, which narrows the gap between images and descriptions. Experiments on two publicly available datasets, CUB and Oxford-102, demonstrate the effectiveness of the proposed CcGL-GAN model.
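To make the attention pattern concrete, the following is a minimal sketch of generic criss-cross attention, in which each position of a feature map attends only to the positions in its own row and column. It is an illustration under stated assumptions, not the CcGL-GAN implementation: the abstract does not specify how the mechanism is wired into the generators, and the function names (`criss_cross_attention`, `softmax`, `dot`), the nested-list feature layout, and the plain dot-product scoring are all hypothetical choices made for readability.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def criss_cross_attention(query, key, value):
    """Apply criss-cross attention to H x W grids of feature vectors.

    query, key, value are nested lists indexed as grid[row][col][channel].
    Assumption: scores are plain dot products with no learned projections.
    """
    height, width = len(query), len(query[0])
    channels = len(value[0][0])
    output = []
    for i in range(height):
        output_row = []
        for j in range(width):
            # Criss-cross path: all of row i, plus column j without (i, j),
            # so each position sees H + W - 1 neighbours, itself included once.
            path = [(i, c) for c in range(width)]
            path += [(r, j) for r in range(height) if r != i]
            weights = softmax([dot(query[i][j], key[r][c]) for r, c in path])
            # Weighted sum of the value vectors along the path.
            mixed = [
                sum(w * value[r][c][k] for w, (r, c) in zip(weights, path))
                for k in range(channels)
            ]
            output_row.append(mixed)
        output.append(output_row)
    return output


if __name__ == "__main__":
    # Toy 2 x 3 feature map: zero queries and keys give uniform weights.
    q = [[[0.0, 0.0] for _ in range(3)] for _ in range(2)]
    v = [[[float(r * 3 + c)] for c in range(3)] for r in range(2)]
    print(criss_cross_attention(q, q, v))
```

Because each position looks at only H + W - 1 others rather than all H x W, the cost per position grows with the sum of the two dimensions instead of their product. A single pass leaves positions outside the row and column unseen; applying the operation twice lets information from any position reach any other.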