{"title":"Research on voice conversion based codebook and GMM","authors":"Wei-chao Xie, Linghua Zhang","doi":"10.1109/ICCT.2010.5689017","DOIUrl":null,"url":null,"abstract":"Voice conversion (VC) is a technique used in order to change the personality characteristics of a source speaker's voice into the target speaker's, while preserving the original semantic information. This paper mainly studies a method of voice conversion with better quality by codebook. Firstly, personality parameters of both source and target speaker are time aligned to create the source and target codebook. Then compare the parameters which to be converted with the source codebook. If they are near enough, the corresponding parameters of target codebook are regarded as the converted parameters. Otherwise, we will use GMM to realize the conversion of LSF parameters and get residual excitation signal by pitch frequency estimated from converted LSF parameters. This method is better than the conversion only using GMM.","PeriodicalId":253478,"journal":{"name":"2010 IEEE 12th International Conference on Communication Technology","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE 12th International Conference on Communication Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCT.2010.5689017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Voice conversion (VC) is a technique that changes the personal characteristics of a source speaker's voice into those of a target speaker while preserving the original semantic content. This paper studies a codebook-based method that improves voice conversion quality. First, the speaker-specific parameters of the source and target speakers are time-aligned to create the source and target codebooks. The parameters to be converted are then compared with the source codebook. If they are close enough to a source entry, the corresponding parameters in the target codebook are taken as the converted parameters. Otherwise, a Gaussian mixture model (GMM) is used to convert the line spectral frequency (LSF) parameters, and the residual excitation signal is obtained from the pitch frequency estimated from the converted LSF parameters. The proposed method performs better than conversion using GMM alone.
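
The decision rule described in the abstract (use the target codebook entry when the input is close to a source entry, otherwise fall back to GMM conversion) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the `threshold` value, the toy codebooks, and the `placeholder_gmm` function are all assumptions, since the abstract does not specify the distance measure, the threshold, or the GMM details.

```python
import math

def convert_frame(x, src_codebook, tgt_codebook, threshold, gmm_convert):
    """Convert one parameter vector (e.g. an LSF frame).

    src_codebook[k] and tgt_codebook[k] are assumed to be time-aligned pairs.
    gmm_convert is a caller-supplied fallback mapping (hypothetical here).
    Returns the converted vector and the name of the branch that produced it.
    """
    # Distance from x to every source codebook entry (Euclidean is an assumption).
    dists = [math.dist(x, entry) for entry in src_codebook]
    k = min(range(len(dists)), key=dists.__getitem__)
    if dists[k] <= threshold:
        # Close enough: reuse the corresponding target codebook entry.
        return list(tgt_codebook[k]), "codebook"
    # Too far from every entry: fall back to the GMM-based conversion.
    return list(gmm_convert(x)), "gmm"

# Toy 2-dimensional codebooks, for illustration only.
src_codebook = [[0.0, 0.0], [1.0, 1.0]]
tgt_codebook = [[10.0, 10.0], [20.0, 20.0]]

def placeholder_gmm(x):
    # Stand-in for a trained GMM mapping; NOT a real GMM.
    return [v + 100.0 for v in x]

near, how_near = convert_frame([0.9, 1.1], src_codebook, tgt_codebook, 0.5, placeholder_gmm)
far, how_far = convert_frame([5.0, 5.0], src_codebook, tgt_codebook, 0.5, placeholder_gmm)
```

In this toy run, the first frame lies within the threshold of the second source entry and takes the codebook branch, while the second frame is far from both entries and is passed to the fallback. A real system would replace `placeholder_gmm` with a mapping trained on aligned source and target LSF vectors.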