High quality voice conversion by post-filtering the outputs of Gaussian processes
N. Xu, Xiao Yao, A. Jiang, Xiaofeng Liu, J. Bao
2016 24th European Signal Processing Conference (EUSIPCO), published 2016-08-01
DOI: 10.1109/EUSIPCO.2016.7760371 (https://doi.org/10.1109/EUSIPCO.2016.7760371)
Citations: 2
Abstract
Voice conversion aims to transform the individuality of source speech so that it mimics that of target speech while keeping the message unaltered. Gaussian mixture model (GMM) based methods are the most commonly used, but they suffer from over-smoothing and over-fitting. In our previous work, we proposed using Gaussian processes to alleviate over-fitting. Although effective, that method inevitably leads to over-smoothing because it takes the mean of the Gaussian process predictive distribution as the optimal estimate. In this paper we therefore address the over-smoothing problem by post-filtering the outputs of standard Gaussian processes, which yields more dynamics in the converted feature parameters. Experiments confirm the validity of the proposed method in both objective and subjective evaluations.
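The over-smoothing effect described in the abstract can be illustrated with a small sketch. The abstract does not specify the post-filter, so the code below is not the authors' method: it assumes a one-dimensional toy feature trajectory, a zero-mean Gaussian process with a squared-exponential kernel, and a simple mean-and-variance rescaling (in the spirit of global-variance compensation) as a stand-in post-filter. All function names and parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_predict_mean(x_train, y_train, x_test, noise_var=0.1):
    """Predictive mean of a zero-mean GP regressor (the 'optimal estimate')."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)
    return K_star @ np.linalg.solve(K, y_train)

def variance_postfilter(y_pred, target_mean, target_std):
    """Assumed stand-in post-filter: rescale the trajectory so that its
    mean and standard deviation match the target statistics."""
    pred_std = y_pred.std()
    if pred_std == 0.0:
        return np.full_like(y_pred, target_mean)
    return (y_pred - y_pred.mean()) / pred_std * target_std + target_mean

# Toy "target feature" trajectory: a smooth trend plus frame-level variation.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 10.0, 200)
y_train = np.sin(x_train) + 0.5 * rng.standard_normal(x_train.size)
x_test = np.linspace(0.0, 10.0, 100)

# The GP mean averages out the frame-level variation (over-smoothing).
y_mean = gp_predict_mean(x_train, y_train, x_test, noise_var=0.25)

# The post-filter restores the dynamic range of the target features.
y_filtered = variance_postfilter(y_mean, y_train.mean(), y_train.std())

print("target std:  ", y_train.std())
print("GP mean std: ", y_mean.std())
print("filtered std:", y_filtered.std())
```

In this toy setting the predictive mean has a smaller standard deviation than the training targets, which is the loss of dynamics the abstract refers to; the rescaling step restores the variance but cannot recover the fine detail itself.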