Modeling segmental duration in German text-to-speech synthesis

Proceedings: ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-601

Bernd Möbius, J. van Santen

Citations: 56

Abstract

This paper reports on the construction of a model of segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech (TTS) system (R. Sproat and J. Olive, 1995; B. Möbius et al., 1996). Construction of the duration system was made efficient by an interactive statistical analysis package that incorporates the approach outlined by J.P.H. van Santen (1994). The results are stored in tables in a format that the TTS duration module can interpret directly. The tables are constructed in two phases: inferential statistical analysis of the speech corpus, followed by parameter estimation. The overall correlation between observed and predicted segmental durations is 0.896.
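The table-driven prediction and the correlation figure reported above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a purely multiplicative model in which a per-segment base duration is scaled by one parameter per factor level, and every name and number in it (`BASE_MS`, `SCALES`, the factor names) is hypothetical. The `pearson` helper shows how a correlation between observed and predicted durations could be computed.

```python
from math import prod, sqrt

# Hypothetical parameter tables; none of these values come from the paper.
BASE_MS = {"a:": 120.0, "t": 70.0}  # assumed base duration per segment, in ms
SCALES = {
    "stress": {"stressed": 1.2, "unstressed": 0.9},
    "position": {"phrase_final": 1.4, "medial": 1.0},
}

def predict_duration(segment, context):
    """Scale the segment's base duration by the table entry for each factor level."""
    return BASE_MS[segment] * prod(SCALES[f][level] for f, level in context.items())

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of durations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In use, `predict_duration("a:", {"stress": "stressed", "position": "phrase_final"})` multiplies the base value by both scale factors, and `pearson(observed, predicted)` would be evaluated over all segments in a corpus.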
