One Billion Audio Sounds from GPU-Enabled Modular Synthesis

Joseph P. Turian, Jordie Shier, G. Tzanetakis, K. McNally, Max Henry
Published in: 2021 24th International Conference on Digital Audio Effects (DAFx), 2021-04-27. DOI: https://doi.org/10.23919/DAFx51585.2021.9768246
Citations: 15

Abstract

We release synth1B1, a multi-modal audio corpus consisting of 1 billion 4-second synthesized sounds, each paired with the synthesis parameters used to generate it. The dataset is 100x larger than any audio dataset in the literature. We also introduce torchsynth, an open-source modular synthesizer that generates the synth1B1 samples on the fly at 16200x faster than real time (714 MHz) on a single GPU. Additionally, we release two new audio datasets: FM synth timbre and subtractive synth pitch. Using these datasets, we demonstrate new rank-based evaluation criteria for existing audio representations. Finally, we propose a novel approach to synthesizer hyperparameter optimization.
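
The throughput figures in the abstract can be cross-checked with a short calculation. The sketch below is not taken from the paper: it assumes a 44.1 kHz sample rate, which the abstract does not state but which is consistent with the quoted 714 MHz figure, and all variable names are illustrative.

```python
# Back-of-the-envelope check of the abstract's throughput figures.
# ASSUMPTION: a 44.1 kHz sample rate; the abstract does not state it.
SAMPLE_RATE_HZ = 44_100          # assumed audio sample rate
SPEEDUP = 16_200                 # "16200x faster than real time"
NUM_SOUNDS = 1_000_000_000       # 1 billion sounds in synth1B1
SECONDS_PER_SOUND = 4            # each sound is 4 seconds long

# Audio samples produced per wall-clock second at the quoted speedup.
samples_per_second = SPEEDUP * SAMPLE_RATE_HZ

# Total duration of the corpus, in seconds and in years.
total_audio_seconds = NUM_SOUNDS * SECONDS_PER_SOUND
total_audio_years = total_audio_seconds / (365.25 * 24 * 3600)

# Wall-clock time to render the whole corpus at the quoted speedup.
generation_hours = total_audio_seconds / SPEEDUP / 3600

print(f"throughput: {samples_per_second / 1e6:.1f} million samples/s")
print(f"corpus duration: {total_audio_years:.1f} years of audio")
print(f"generation time: {generation_hours:.1f} hours on one GPU")
```

Under that assumption, the speedup works out to roughly 714 million samples per second, which agrees with the 714 MHz quoted in the abstract. The same arithmetic suggests the full corpus is on the order of a century of audio and would take a few days to render on a single GPU at the stated rate.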