Differentiable Architecture Search With Attention Mechanisms for Generative Adversarial Networks

Impact Factor: 5.3 | JCR Q1 (Computer Science, Artificial Intelligence) | CAS Quartile 3 (Computer Science)
Yu Xue;Kun Chen;Ferrante Neri
{"title":"Differentiable Architecture Search With Attention Mechanisms for Generative Adversarial Networks","authors":"Yu Xue;Kun Chen;Ferrante Neri","doi":"10.1109/TETCI.2024.3369998","DOIUrl":null,"url":null,"abstract":"Generative adversarial networks (GANs) are machine learning algorithms that can efficiently generate data such as images. Although GANs are very popular, their training usually lacks stability, with the generator and discriminator networks failing to converge during the training process. To address this problem and improve the stability of GANs, in this paper, we automate the design of stable GANs architectures through a novel approach: differentiable architecture search with attention mechanisms for generative adversarial networks (\n<bold>DAMGAN</b>\n). We construct a generator supernet and search for the optimal generator network within it. We propose incorporating two attention mechanisms between each pair of nodes in the supernet. The first attention mechanism, down attention, selects the optimal candidate operation of each edge in the supernet, while the second attention mechanism, up attention, improves the training stability of the supernet and limits the computational cost of the search by selecting the most important feature maps for the following candidate operations. Experimental results show that the architectures searched by our method obtain a state-of-the-art inception score (IS) of 8.99 and a very competitive Fréchet inception distance (FID) of 10.27 on the CIFAR-10 dataset. Competitive results were also obtained on the STL-10 dataset (IS = 10.35, FID = 22.18). Notably, our search time was only 0.09 GPU days.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 4","pages":"3141-3151"},"PeriodicalIF":5.3000,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10477508/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Generative adversarial networks (GANs) are machine learning algorithms that can efficiently generate data such as images. Although GANs are very popular, their training usually lacks stability, with the generator and discriminator networks failing to converge during the training process. To address this problem and improve the stability of GANs, in this paper, we automate the design of stable GAN architectures through a novel approach: differentiable architecture search with attention mechanisms for generative adversarial networks (DAMGAN). We construct a generator supernet and search for the optimal generator network within it. We propose incorporating two attention mechanisms between each pair of nodes in the supernet. The first attention mechanism, down attention, selects the optimal candidate operation of each edge in the supernet, while the second attention mechanism, up attention, improves the training stability of the supernet and limits the computational cost of the search by selecting the most important feature maps for the subsequent candidate operations. Experimental results show that the architectures searched by our method obtain a state-of-the-art inception score (IS) of 8.99 and a very competitive Fréchet inception distance (FID) of 10.27 on the CIFAR-10 dataset. Competitive results were also obtained on the STL-10 dataset (IS = 10.35, FID = 22.18). Notably, our search time was only 0.09 GPU days.
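The abstract describes the two attention mechanisms only at a high level. Below is a minimal, hedged PyTorch sketch of what one supernet edge with such mechanisms could look like: down attention modeled as a DARTS-style softmax over candidate operations, and up attention modeled as squeeze-and-excitation channel gating over feature maps. All names here (MixedEdge, DownAttention, UpAttention, candidate_ops, reduction) and the candidate operation set are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a supernet edge with two attention mechanisms, loosely
# following the DAMGAN abstract. Hypothetical names and search space;
# not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def candidate_ops(channels):
    # Assumed candidate set; the paper's exact operations are not
    # given in the abstract.
    return nn.ModuleList([
        nn.Conv2d(channels, channels, 1, padding=0),
        nn.Conv2d(channels, channels, 3, padding=1),
        nn.Conv2d(channels, channels, 5, padding=2),
        nn.Identity(),
    ])

class DownAttention(nn.Module):
    """Learnable weights over candidate operations, analogous to DARTS
    architecture parameters: the softmax scores which operation on this
    edge is most promising."""
    def __init__(self, num_ops):
        super().__init__()
        self.alpha = nn.Parameter(1e-3 * torch.randn(num_ops))

    def forward(self):
        return F.softmax(self.alpha, dim=0)  # shape: (num_ops,)

class UpAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: gates feature
    maps so that later candidate operations see the most important
    ones (one plausible reading of 'up attention')."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (N, C, H, W) -> per-channel gates in [0, 1]
        s = x.mean(dim=(2, 3))                      # squeeze to (N, C)
        g = self.fc(s).unsqueeze(-1).unsqueeze(-1)  # (N, C, 1, 1)
        return x * g                                # gate each feature map

class MixedEdge(nn.Module):
    """One supernet edge: channel attention on the input, then a
    softmax-weighted sum of all candidate operations."""
    def __init__(self, channels):
        super().__init__()
        self.ops = candidate_ops(channels)
        self.down = DownAttention(len(self.ops))
        self.up = UpAttention(channels)

    def forward(self, x):
        x = self.up(x)
        w = self.down()
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Usage: a forward pass through one edge of the supernet.
edge = MixedEdge(channels=64)
out = edge(torch.randn(2, 64, 16, 16))  # -> (2, 64, 16, 16)
```

After search, discretization would presumably keep only the operation with the largest down-attention weight on each edge, as in DARTS (e.g., `edge.ops[edge.down.alpha.argmax()]`).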
Source Journal

CiteScore: 10.30
Self-citation rate: 7.50%
Annual articles: 147

The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronics-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.