Improved unit selection speech synthesis method utilizing subjective evaluation results on synthetic speech

2012 8th International Symposium on Chinese Spoken Language Processing, Pub Date: 2012-12-01, DOI: 10.1109/ISCSLP.2012.6423524

Xian-Jun Xia, Zhenhua Ling, Chen-Yu Yang, Lirong Dai


Citations: 5

Abstract

This paper presents an improved unit selection and waveform concatenation speech synthesis method that gathers and uses human feedback on synthetic speech. First, a set of texts is synthesized by the baseline unit selection synthesis system. Listeners then judge each prosodic word in the synthetic speech as either natural or unnatural. In the proposed method, the segments judged natural are treated as virtual candidate units that extend the original speech corpus for unit selection, and a new speech synthesis system is constructed from this extended corpus. A synthetic error detector based on an SVM classifier is also trained on the natural and unnatural synthetic speech. At synthesis time, the input text is synthesized by both the baseline system and the extended system. The trained synthetic error detector evaluates the two unit selection results to determine the better one. Experimental results on a place-name synthesis task show that the proposed method improves the naturalness of synthetic speech.
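The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the `Segment` structure, the two-element feature vectors, the hand-set weights, and the rule of summing detector scores over a result are all assumptions made for the example. The trained SVM is replaced here by a plain linear decision function so the snippet runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Segment:
    """One synthesized prosodic word (hypothetical structure)."""
    text: str
    features: Sequence[float]  # illustrative features, not the paper's
    source: str                # "baseline" or "extended"


def extend_corpus(corpus: List[str], rated: List[Tuple[str, bool]]) -> List[str]:
    """Add segments that listeners judged natural as virtual candidate units."""
    virtual = [unit for unit, is_natural in rated if is_natural]
    return corpus + virtual


def linear_detector(weights: Sequence[float], bias: float) -> Callable[[Sequence[float]], float]:
    """Stand-in for the trained SVM: a linear decision function w.x + b.

    Assumption: positive scores mean 'natural', negative mean 'unnatural'.
    """
    def score(features: Sequence[float]) -> float:
        return sum(w * f for w, f in zip(weights, features)) + bias
    return score


def select_output(baseline: List[Segment], extended: List[Segment],
                  score: Callable[[Sequence[float]], float]) -> List[Segment]:
    """Keep whichever unit selection result the detector rates higher.

    Assumption: a result's rating is the sum of its per-segment scores.
    """
    def total(result: List[Segment]) -> float:
        return sum(score(seg.features) for seg in result)
    return extended if total(extended) > total(baseline) else baseline


# Toy usage with made-up values.
corpus = extend_corpus(["unit_a"], [("synth_b", True), ("synth_c", False)])
detector = linear_detector(weights=[-1.0, -0.5], bias=1.0)
baseline = [Segment("place name", [0.9, 0.8], "baseline")]
extended = [Segment("place name", [0.2, 0.1], "extended")]
chosen = select_output(baseline, extended, detector)
```

In this toy run the extended result scores higher than the baseline, so it is the one kept. A real detector would be trained on the listener-labelled natural and unnatural segments rather than configured by hand.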

