Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation

Yen-Tung Yeh, Bo-Yu Chen, Yi-Hsuan Yang
{"title":"Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation","authors":"Yen-Tung Yeh, Bo-Yu Chen, Yi-Hsuan Yang","doi":"10.48550/arXiv.2209.01751","DOIUrl":null,"url":null,"abstract":"While generative adversarial networks (GANs) have been widely used in research on audio generation, the training of a GAN model is known to be unstable, time consuming, and data inefficient. Among the attempts to ameliorate the training process of GANs, the idea of Projected GAN emerges as an effective solution for GAN-based image generation, establishing the state-of-the-art in different image applications. The core idea is to use a pre-trained classifier to constrain the feature space of the discriminator to stabilize and improve GAN training. This paper investigates whether Projected GAN can similarly improve audio generation, by evaluating the performance of a StyleGAN2-based audio-domain loop generation model with and without using a pre-trained feature space in the discriminator. Moreover, we compare the performance of using a general versus domain-specific classifier as the pre-trained audio classifier. 
With experiments on both drum loop and synth loop generation, we show that a general audio classifier works better, and that with Projected GAN our loop generation models can converge around 5 times faster without performance degradation.","PeriodicalId":309903,"journal":{"name":"International Society for Music Information Retrieval Conference","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Society for Music Information Retrieval Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2209.01751","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

While generative adversarial networks (GANs) have been widely used in research on audio generation, the training of a GAN model is known to be unstable, time-consuming, and data-inefficient. Among attempts to ameliorate the training process of GANs, Projected GAN has emerged as an effective solution for GAN-based image generation, establishing the state of the art in several image applications. Its core idea is to use a pre-trained classifier to constrain the feature space of the discriminator, stabilizing and improving GAN training. This paper investigates whether Projected GAN can similarly improve audio generation by evaluating the performance of a StyleGAN2-based audio-domain loop generation model with and without a pre-trained feature space in the discriminator. Moreover, we compare the performance of using a general versus a domain-specific classifier as the pre-trained audio classifier. With experiments on both drum loop and synth loop generation, we show that a general audio classifier works better, and that with Projected GAN our loop generation models converge around 5 times faster without performance degradation.
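The core mechanism the abstract describes — a discriminator that judges samples in the feature space of a frozen, pre-trained classifier rather than on raw input — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the fixed random linear map stands in for a real pre-trained audio backbone, and all names and dimensions (16 kHz one-second clips, a 64-dimensional feature space) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pre-trained" feature extractor: a fixed random projection used as a
# stand-in for a real audio classifier backbone. These weights are never updated.
W_feat = rng.standard_normal((16000, 64)) * 0.01

# Trainable discriminator head operating on the frozen feature space.
# Only this part would receive gradient updates during GAN training.
W_head = rng.standard_normal((64, 1)) * 0.1

def discriminator(x):
    feats = np.maximum(x @ W_feat, 0.0)  # frozen features (ReLU nonlinearity)
    return feats @ W_head                # real/fake logit per example

x = rng.standard_normal((2, 16000))      # batch of two 1-second clips at 16 kHz
logits = discriminator(x)
print(logits.shape)  # (2, 1)
```

Constraining the discriminator to a fixed, well-structured feature space is what stabilizes training in the Projected GAN setup: the generator receives gradients through features a classifier has already learned, instead of through a discriminator learned from scratch.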