Modeling segmental duration in German text-to-speech synthesis
Bernd Möbius, J. P. H. van Santen
Proceedings: ICSLP, International Conference on Spoken Language Processing, pp. 2395-2398. Published 1996-10-03.
DOI: 10.21437/ICSLP.1996-601 (https://doi.org/10.21437/ICSLP.1996-601)
Citations: 56
Abstract
The paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system (R. Sproat and J. Olive, 1995; B. Möbius et al., 1996). The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined by J.P.H. van Santen (1994). The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is 0.896.
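
As a rough illustration of how a table-driven duration module of this kind might be used and evaluated, the Python sketch below looks up parameters in a table, predicts a segment duration, and computes the correlation between observed and predicted durations. The purely multiplicative model form, the factor names, the function names, and every numeric value are assumptions made for illustration; the abstract does not specify them, and this is not the Bell Labs implementation.

```python
import math

# Hypothetical parameter tables (all values invented for illustration):
# a base duration in milliseconds per phone, and one multiplicative
# scale factor per level of each context feature.
BASE_MS = {"a:": 120.0, "t": 60.0, "n": 55.0}
FACTORS = {
    "stress": {"stressed": 1.25, "unstressed": 0.9},
    "position": {"phrase_final": 1.4, "medial": 1.0},
}


def predict_duration(phone, context):
    """Scale the phone's base duration by one table entry per context feature."""
    duration = BASE_MS[phone]
    for feature, level in context.items():
        duration *= FACTORS[feature][level]
    return duration


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented "observed" durations paired with model predictions.
    segments = [
        ("a:", {"stress": "stressed", "position": "phrase_final"}, 205.0),
        ("t", {"stress": "unstressed", "position": "medial"}, 58.0),
        ("n", {"stress": "stressed", "position": "medial"}, 70.0),
    ]
    predicted = [predict_duration(p, ctx) for p, ctx, _ in segments]
    observed = [obs for _, _, obs in segments]
    print("correlation:", pearson(observed, predicted))
```

In a setup like this, the statistical analysis phase would decide which context features and levels appear in the tables, and the parameter estimation phase would fill in the numbers. The correlation computed at the end corresponds to the kind of figure the abstract reports (0.896), here on toy data.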