Non-parallel and Many-to-One Musical Timbre Morphing using DDSP-Autoencoder and Spectral Feature Interpolation

Yi Zou, Jingyu Liu, Wei Jiang
2021 International Conference on Culture-oriented Science & Technology (ICCST), published 2021-11-01
DOI: https://doi.org/10.1109/ICCST53801.2021.00040
Citations: 0

Abstract
Timbre morphing is a signal processing technique that uses an interpolation algorithm to gradually change the timbre of one instrument into that of another. However, the technique requires as input a prepared target recording with the same musical content (such as score and rhythm) as the original audio. To meet the requirements of non-parallel and many-to-one processing, we combine spectral feature interpolation, a signal processing technique, with an autoencoder, a deep learning technique, to give timbre morphing the ability to generalize. The framework, which focuses on processing harmonic information, is built from pitch tracking, temporal signal processing, a decoder, source-filter modeling, and spectral feature interpolation. Notably, unlike the neural vocoder commonly used in voice conversion, the decoder is trained by data reconstruction and maps the input audio features to a harmonic amplitude distribution with the target timbre. In the experiments, a clarinet timbre was morphed into that of a trumpet. An evaluation of the spectral features demonstrated the validity of the morphing technique.
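As a rough illustration of the idea of interpolating between harmonic amplitude distributions, the sketch below blends two toy distributions and renders the result with a bank of sinusoids. It is not the authors' framework: the linear blend, the function names `interpolate_harmonics` and `synthesize`, the sample rate, and the example amplitude values are all assumptions made for illustration, and the decoder, pitch tracker, and source-filter model described in the abstract are omitted.

```python
import numpy as np


def interpolate_harmonics(src_amps, tgt_amps, alpha):
    """Blend two harmonic amplitude distributions (hypothetical helper).

    alpha = 0 returns the normalized source, alpha = 1 the normalized target.
    Linear blending is an assumption; the paper's spectral features may differ.
    """
    alpha = float(np.clip(alpha, 0.0, 1.0))
    mix = (1.0 - alpha) * np.asarray(src_amps) + alpha * np.asarray(tgt_amps)
    return mix / mix.sum()  # keep the distribution summing to 1


def synthesize(f0_hz, amps, duration_s=0.5, sample_rate=16000):
    """Additive synthesis: sum sinusoids at integer multiples of f0_hz."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    harmonic_numbers = np.arange(1, len(amps) + 1)
    freqs = f0_hz * harmonic_numbers
    # Silence any harmonic at or above the Nyquist frequency to avoid aliasing.
    amps = np.where(freqs < sample_rate / 2, amps, 0.0)
    partials = amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t[None, :])
    return partials.sum(axis=0)


# Toy distributions (not measured data): odd harmonics dominate the first,
# while the second decays smoothly across harmonics.
source = np.array([0.60, 0.05, 0.25, 0.02, 0.08])
target = np.array([0.35, 0.30, 0.20, 0.10, 0.05])

# Render a halfway morph on a 220 Hz note.
halfway = interpolate_harmonics(source, target, alpha=0.5)
audio = synthesize(220.0, halfway)
```

Sweeping `alpha` from 0 to 1 over successive frames would give a gradual morph. In the framework the abstract describes, the target-side distribution would come from the trained decoder rather than from fixed numbers like these.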