An embedded English synthesis approach based on speech concatenation and smoothing
Guilin Chen, Dongjian Yue, Yiqing Zu, Zhenli Yu
2004 International Symposium on Chinese Spoken Language Processing, 2004-12-15
DOI: 10.1109/CHINSL.2004.1409610 (https://doi.org/10.1109/CHINSL.2004.1409610)
Citations: 6
Abstract
This paper describes an embedded English synthesis approach based on speech concatenation and smoothing. The approach uses phonetic sub-words as carriers of variable-length units and defines five classes of units to cover all English phonetic phenomena. For the unit-selection stage, the paper presents the corresponding cost function and a search procedure based on dynamic programming. In the synthesis stage, vocal tract response, pitch value and phase are interpolated and merged at the concatenation points to smooth the speech. Preliminary tests show that the approach can achieve a good balance of naturalness, intelligibility and data footprint.
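
The abstract does not give the cost function or the details of the search, so the following is only a generic sketch of how a dynamic-programming unit-selection search is commonly structured, not the authors' implementation. The function name `select_units`, the split into `target_cost` and `join_cost`, and the toy units (name, start pitch, end pitch) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a candidate unit; its representation is up to the caller


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> Tuple[List[U], float]:
    """Pick one unit per target position so the summed cost is minimal.

    Hypothetical interface: target_cost(i, u) scores unit u at position i,
    join_cost(p, u) scores concatenating unit p with the following unit u.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j]: cheapest total cost of any path ending in candidate j
    # of the position processed so far.
    best = [target_cost(0, u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i - 1][j]: best predecessor index

    for i in range(1, len(candidates)):
        prev_units = candidates[i - 1]
        new_best: List[float] = []
        pointers: List[int] = []
        for u in candidates[i]:
            costs = [best[k] + join_cost(p, u) for k, p in enumerate(prev_units)]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(i, u))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position to the first.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total


# Toy data (assumed): each unit is (name, start_pitch_hz, end_pitch_hz).
toy_candidates = [
    [("a1", 100, 110), ("a2", 120, 125)],
    [("b1", 112, 118), ("b2", 180, 170)],
    [("c1", 119, 115), ("c2", 90, 95)],
]

# Assumed join cost: pitch mismatch at the concatenation point.
units, total = select_units(
    toy_candidates,
    target_cost=lambda i, u: 0.0,
    join_cost=lambda p, u: abs(p[2] - u[1]),
)
print([u[0] for u in units], total)  # ['a1', 'b1', 'c1'] 3.0
```

With this toy join cost the search prefers units whose boundary pitch values already match, which leaves less discontinuity for the smoothing stage to interpolate away. A real system would combine several features in both cost terms.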