Title: Differentiable Architecture Search With Attention Mechanisms for Generative Adversarial Networks
Authors: Yu Xue; Kun Chen; Ferrante Neri
DOI: 10.1109/TETCI.2024.3369998
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 4, pp. 3141-3151
Publication date: 2024-03-21
URL: https://ieeexplore.ieee.org/document/10477508/
Citations: 0
Abstract
Generative adversarial networks (GANs) are machine learning algorithms that can efficiently generate data such as images. Although GANs are very popular, their training usually lacks stability, with the generator and discriminator networks failing to converge during the training process. To address this problem and improve the stability of GANs, in this paper, we automate the design of stable GAN architectures through a novel approach: differentiable architecture search with attention mechanisms for generative adversarial networks (DAMGAN). We construct a generator supernet and search for the optimal generator network within it. We propose incorporating two attention mechanisms between each pair of nodes in the supernet. The first attention mechanism, down attention, selects the optimal candidate operation of each edge in the supernet, while the second attention mechanism, up attention, improves the training stability of the supernet and limits the computational cost of the search by selecting the most important feature maps for the following candidate operations. Experimental results show that the architectures searched by our method obtain a state-of-the-art inception score (IS) of 8.99 and a very competitive Fréchet inception distance (FID) of 10.27 on the CIFAR-10 dataset. Competitive results were also obtained on the STL-10 dataset (IS = 10.35, FID = 22.18). Notably, our search time was only 0.09 GPU days.
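The abstract's two mechanisms can be illustrated with a minimal PyTorch sketch: a DARTS-style mixed edge whose learned attention logits weight the candidate operations (the role of "down attention"), followed by a squeeze-and-excite-style channel attention that re-weights feature maps before the next operation (the role of "up attention"). All class names, candidate operations, and layer sizes below are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DownAttentionMixedOp(nn.Module):
    """Hypothetical mixed edge: attention logits (one per candidate
    operation) are softmax-normalized and weight the operations' outputs."""
    def __init__(self, channels):
        super().__init__()
        # Illustrative candidate set; the paper's search space differs.
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)  # attention over candidate ops
        return sum(wi * op(x) for wi, op in zip(w, self.ops))


class UpAttention(nn.Module):
    """Hypothetical channel attention: scores each feature map so later
    candidate operations can focus on the most important ones."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        s = x.mean(dim=(2, 3))             # squeeze: global average pool
        w = self.fc(s)[:, :, None, None]   # per-channel importance in [0, 1]
        return x * w


x = torch.randn(2, 16, 8, 8)
edge = DownAttentionMixedOp(16)
up = UpAttention(16)
y = up(edge(x))
print(y.shape)  # torch.Size([2, 16, 8, 8])
```

Both modules preserve the input shape, so they can be stacked between every pair of nodes in a supernet; after search, the operation with the largest attention weight on each edge would be retained in the final architecture.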
Journal Introduction:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication that publishes six issues per year.
Authors are encouraged to submit manuscripts in any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few such illustrative examples are glial cell networks, computational neuroscience, Brain Computer Interface, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, computational intelligence for the IoT and Smart-X technologies.