Generative Adversarial Networks with Learnable Auxiliary Module for Image Synthesis

IF 5.2 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Yan Gan, Chenxue Yang, Mao Ye, Renjie Huang, Deqiang Ouyang
{"title":"Generative Adversarial Networks with Learnable Auxiliary Module for Image Synthesis","authors":"Yan Gan, Chenxue Yang, Mao Ye, Renjie Huang, Deqiang Ouyang","doi":"10.1145/3653021","DOIUrl":null,"url":null,"abstract":"<p>Training generative adversarial networks (GANs) for noise-to-image synthesis is a challenge task, primarily due to the instability of GANs’ training process. One of the key issues is the generator’s sensitivity to input data, which can cause sudden fluctuations in the generator’s loss value with certain inputs. This sensitivity suggests an inadequate ability to resist disturbances in the generator, causing the discriminator’s loss value to oscillate and negatively impacting the discriminator. Then, the negative feedback of discriminator is also not conducive to updating generator’s parameters, leading to suboptimal image generation quality. In response to this challenge, we present an innovative GANs model equipped with a learnable auxiliary module that processes auxiliary noise. The core objective of this module is to enhance the stability of both the generator and discriminator throughout the training process. To achieve this target, we incorporate a learnable auxiliary penalty and an augmented discriminator, designed to control the generator and reinforce the discriminator’s stability, respectively. We further apply our method to the Hinge and LSGANs loss functions, illustrating its efficacy in reducing the instability of both the generator and the discriminator. The tests we conducted on LSUN, CelebA, Market-1501 and Creative Senz3D datasets serve as proof of our method’s ability to improve the training stability and overall performance of the baseline methods.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.2000,"publicationDate":"2024-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3653021","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Training generative adversarial networks (GANs) for noise-to-image synthesis is a challenge task, primarily due to the instability of GANs’ training process. One of the key issues is the generator’s sensitivity to input data, which can cause sudden fluctuations in the generator’s loss value with certain inputs. This sensitivity suggests an inadequate ability to resist disturbances in the generator, causing the discriminator’s loss value to oscillate and negatively impacting the discriminator. Then, the negative feedback of discriminator is also not conducive to updating generator’s parameters, leading to suboptimal image generation quality. In response to this challenge, we present an innovative GANs model equipped with a learnable auxiliary module that processes auxiliary noise. The core objective of this module is to enhance the stability of both the generator and discriminator throughout the training process. To achieve this target, we incorporate a learnable auxiliary penalty and an augmented discriminator, designed to control the generator and reinforce the discriminator’s stability, respectively. We further apply our method to the Hinge and LSGANs loss functions, illustrating its efficacy in reducing the instability of both the generator and the discriminator. The tests we conducted on LSUN, CelebA, Market-1501 and Creative Senz3D datasets serve as proof of our method’s ability to improve the training stability and overall performance of the baseline methods.

具有可学习辅助模块的生成式对抗网络用于图像合成
训练生成式对抗网络(GAN)进行噪声-图像合成是一项具有挑战性的任务,这主要是由于生成式对抗网络的训练过程具有不稳定性。其中一个关键问题是生成器对输入数据的敏感性,在某些输入情况下,生成器的损失值会突然波动。这种敏感性表明生成器抵御干扰的能力不足,从而导致鉴别器的损耗值波动,对鉴别器产生负面影响。然后,鉴别器的负反馈也不利于更新发生器的参数,导致图像生成质量不理想。为了应对这一挑战,我们提出了一种创新的 GANs 模型,该模型配备了一个可学习的辅助模块,用于处理辅助噪声。该模块的核心目标是在整个训练过程中增强生成器和鉴别器的稳定性。为了实现这一目标,我们加入了可学习辅助惩罚和增强判别器,分别用于控制生成器和增强判别器的稳定性。我们进一步将我们的方法应用于 Hinge 和 LSGANs 损失函数,说明它在降低生成器和判别器的不稳定性方面的功效。我们在 LSUN、CelebA、Market-1501 和 Creative Senz3D 数据集上进行的测试证明了我们的方法能够提高训练稳定性和基线方法的整体性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
8.50
自引率
5.90%
发文量
285
审稿时长
7.5 months
期刊介绍: The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.
文献相关原料
公司名称 产品信息 采购帮参考价格
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信