An investigation on the Mandarin prosody of a parallel multi-speaking rate speech corpus

2009 Oriental COCOSDA International Conference on Speech Database and Assessments. Pub Date: 2009-10-02. DOI: 10.1109/ICSDA.2009.5278360

Chen-Yu Chiang, C. Tang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen

Citations: 7

Abstract

In this paper, the prosody of a parallel multi-speaking-rate Mandarin read speech corpus is investigated. The corpus contains four parallel speech datasets uttered by a female professional announcer at speech rates (SRs) of 4.40 (fast), 3.82 (normal), 2.97 (median) and 2.45 (slow) syllables/second. Using the previously proposed unsupervised joint prosody labeling and modeling (PLM) method, we investigate the relationship between SR and various prosodic features, including pause duration, the patterns of three high-level prosodic constituents, and the break labels. The analyses reported in this study could be informative for developing prosody generation mechanisms for text-to-speech and prosody models for automatic speech recognition at various SRs.


