StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis

IF 20.8 | Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Minguk Kang, Joonghyuk Shin, Jaesik Park
DOI: 10.48550/arXiv.2206.09479
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication date: 2022-06-19
Citations: 28

Abstract

The Generative Adversarial Network (GAN) is one of the state-of-the-art generative models for realistic image synthesis. While training and evaluating GANs is becoming increasingly important, the current GAN research ecosystem does not provide reliable benchmarks in which evaluation is conducted consistently and fairly. Furthermore, because there are few validated GAN implementations, researchers devote considerable time to reproducing baselines. We study the taxonomy of GAN approaches and present a new open-source library named StudioGAN. StudioGAN supports 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, 12 regularization modules, 3 differentiable augmentations, 7 evaluation metrics, and 5 evaluation backbones. With our training and evaluation protocol, we present a large-scale benchmark using various datasets (CIFAR10, ImageNet, AFHQv2, FFHQ, and Baby/Papa/Granpa-ImageNet) and 3 different evaluation backbones (InceptionV3, SwAV, and Swin Transformer). Unlike other benchmarks used in the GAN community, we train representative GANs, including the BigGAN and StyleGAN series, in a unified training pipeline and quantify generation performance with 7 evaluation metrics. The benchmark also evaluates other cutting-edge generative models (e.g., StyleGAN-XL, ADM, MaskGIT, and RQ-Transformer). StudioGAN provides GAN implementations, training scripts, and evaluation scripts, together with pre-trained weights. StudioGAN is available at https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
Source journal
CiteScore: 28.40
Self-citation rate: 3.00%
Articles published: 885
Review time: 8.5 months
Journal description: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures are also covered.