High Quality Voice Conversion through Phoneme-Based Linear Mapping Functions with STRAIGHT for Mandarin

Kun Liu, Jianping Zhang, Yonghong Yan
Published in: Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), 2007-08-24
DOI: 10.1109/FSKD.2007.347 (https://doi.org/10.1109/FSKD.2007.347)
Citations: 70

Abstract

This paper proposes a voice conversion system for Mandarin that applies phoneme-based linear mapping functions to the main vowel phonemes. The algorithm makes three improvements. First, instead of using every vocal tract resonance (VTR) vector within a phoneme, it uses only the VTR vector at the steady state of each phoneme to train a phoneme-based Gaussian mixture model (GMM). Second, a separate linear mapping function is trained for each pair of corresponding phonemes to describe their mapping relationship. Third, during transformation, the converted formant frequencies at the main vowel phonemes are obtained from the corresponding GMM. Prosody parameters are transformed as well. Finally, the converted speech is re-synthesized from the transformed parameters with the high-quality speech manipulation framework STRAIGHT (Speech Transformation and Representation based on Adaptive Interpolation of weiGHTed spectrogram). Perceptual results for F-M and M-F conversion show that, compared with IBM's system, the MOS of the converted voice improves from 3.8 to 4.1 and the ABX score from 3.3 to 3.8. Comparisons with other systems are also given.
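To illustrate the idea of one linear mapping per phoneme, here is a minimal sketch that fits an affine map from source to target steady-state formant vectors by ordinary least squares. It is not the authors' implementation: the paper trains phoneme-based GMMs, and this sketch substitutes a plain least-squares fit. The function names, the phoneme label, and the synthetic formant values are all hypothetical.

```python
import numpy as np

def fit_phoneme_maps(source, target):
    """Fit one affine map per phoneme (assumed setup, not the paper's GMM training).

    source, target: dict mapping a phoneme label to an array of shape
    (n_samples, n_formants) holding paired steady-state VTR vectors.
    Returns a dict mapping each phoneme to a (n_formants + 1, n_formants) matrix.
    """
    maps = {}
    for phoneme, x in source.items():
        y = target[phoneme]
        # Append a column of ones so the fit includes a bias term.
        x_aug = np.hstack([x, np.ones((x.shape[0], 1))])
        w, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
        maps[phoneme] = w
    return maps

def convert(maps, phoneme, vtr):
    """Apply the affine map of the given phoneme to one source VTR vector."""
    return np.append(vtr, 1.0) @ maps[phoneme]

if __name__ == "__main__":
    # Synthetic paired data for a single hypothetical phoneme "a":
    # three formant frequencies in Hz, with the target a scaled, shifted copy.
    rng = np.random.default_rng(0)
    src = rng.uniform(300.0, 3000.0, size=(20, 3))
    tgt = src * np.array([0.9, 0.85, 0.8]) + np.array([10.0, 20.0, 30.0])

    maps = fit_phoneme_maps({"a": src}, {"a": tgt})
    print(convert(maps, "a", np.array([700.0, 1200.0, 2500.0])))
```

Keeping a separate matrix per phoneme means each vowel gets its own source-to-target relationship, which is the point of the phoneme-based design described in the abstract. A full system would also need steady-state detection, prosody conversion, and STRAIGHT analysis and synthesis, none of which is shown here.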