Non-parallel and Many-to-One Musical Timbre Morphing using DDSP-Autoencoder and Spectral Feature Interpolation
Authors: Yi Zou, Jingyu Liu, Wei Jiang
Venue: 2021 International Conference on Culture-oriented Science & Technology (ICCST)
Published: 2021-11-01
DOI: 10.1109/ICCST53801.2021.00040 (https://doi.org/10.1109/ICCST53801.2021.00040)
Citation count: 0
Abstract
Timbre morphing is a signal processing technique that uses an interpolation algorithm to gradually change the timbre of one instrument into that of another. However, the technique requires as input a prepared target audio recording with the same musical content (such as score and rhythm) as the original audio. To meet the application requirements of non-parallel and many-to-one processing, we combined spectral feature interpolation, a signal processing technique, with an autoencoder, a deep learning technique, to give timbre morphing the ability to generalize. The framework, which focuses on processing harmonic information, is built from pitch tracking, temporal signal processing, a decoder, source-filter modeling, and spectral feature interpolation. Notably, unlike the neural vocoder commonly used in voice conversion, the decoder is trained by data reconstruction and maps the input audio features to a harmonic amplitude distribution with the target timbre. In the experiments, a clarinet timbre was morphed into a trumpet timbre. An evaluation of the spectral features demonstrated the validity of the morphing technique.
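The abstract does not give the interpolation formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the morph is a linear blend of two harmonic amplitude distributions controlled by a factor `alpha`, and that the result drives a simple additive (sum-of-sinusoids) synthesizer. The function names, the amplitude values, and the normalization to a sum of 1 are all hypothetical choices made for illustration.

```python
import math


def interpolate_harmonics(source_amps, target_amps, alpha):
    """Linearly blend two harmonic amplitude distributions.

    alpha = 0.0 returns the source distribution, alpha = 1.0 the target.
    Linear blending is an assumption of this sketch, not the paper's formula.
    """
    if len(source_amps) != len(target_amps):
        raise ValueError("distributions must have the same number of harmonics")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    blended = [(1.0 - alpha) * s + alpha * t
               for s, t in zip(source_amps, target_amps)]
    total = sum(blended)
    # Normalize so the amplitudes sum to 1 (assumed convention).
    return [a / total for a in blended] if total > 0 else blended


def synthesize(f0_hz, harmonic_amps, duration_s=0.01, sample_rate=16000):
    """Additive synthesis: sum of sinusoids at integer multiples of f0."""
    n_samples = int(round(duration_s * sample_rate))
    nyquist = sample_rate / 2.0
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        value = 0.0
        for k, amp in enumerate(harmonic_amps, start=1):
            if k * f0_hz < nyquist:  # skip harmonics that would alias
                value += amp * math.sin(2.0 * math.pi * k * f0_hz * t)
        samples.append(value)
    return samples


# Made-up illustrative distributions (not measured data).
clarinet_like = [0.60, 0.02, 0.25, 0.02, 0.11]  # odd harmonics dominate
trumpet_like = [0.30, 0.25, 0.20, 0.15, 0.10]   # smoother roll-off

halfway = interpolate_harmonics(clarinet_like, trumpet_like, alpha=0.5)
frame = synthesize(f0_hz=220.0, harmonic_amps=halfway)
```

Sweeping `alpha` from 0 to 1 over successive frames would produce a gradual morph in this toy setting. In the framework described above, the target distribution would come from the trained decoder rather than from a fixed list.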