{"title":"A novel WI decoder for the segmented frame decoding in the text-to-speech synthesizer","authors":"Kyungjin Byun, N. Eum, Heebum Jung","doi":"10.5220/0002978901510154","DOIUrl":null,"url":null,"abstract":"The implementation of a high quality text-to-speech (TTS) requires huge storage space for a large number of speech segments, because current TTS synthesizers are mostly based on a technique known as synthesis by concatenation. In order to compress the database in the TTS system, the use of speech coders would be an efficient solution. Waveform interpolation (WI) has been shown to be an efficient speech coding algorithm to provide high quality speech at low bit rates. However, the speech coder used in a TTS system has to be different from the one used in communication applications because the decoder in the TTS system should have an ability to decode segmented frames. In this paper, we propose a novel WI decoder scheme that can handle the segmented frame decoding. The decoder can reconstruct a good quality speech even at the concatenation boundary, which is effective for the TTS system based on a synthesis by concatenation.","PeriodicalId":408116,"journal":{"name":"2010 International Conference on Signal Processing and Multimedia Applications (SIGMAP)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 International Conference on Signal Processing and Multimedia Applications (SIGMAP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0002978901510154","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
The implementation of a high quality text-to-speech (TTS) requires huge storage space for a large number of speech segments, because current TTS synthesizers are mostly based on a technique known as synthesis by concatenation. In order to compress the database in the TTS system, the use of speech coders would be an efficient solution. Waveform interpolation (WI) has been shown to be an efficient speech coding algorithm to provide high quality speech at low bit rates. However, the speech coder used in a TTS system has to be different from the one used in communication applications because the decoder in the TTS system should have an ability to decode segmented frames. In this paper, we propose a novel WI decoder scheme that can handle the segmented frame decoding. The decoder can reconstruct a good quality speech even at the concatenation boundary, which is effective for the TTS system based on a synthesis by concatenation.