A novel WI decoder for the segmented frame decoding in the text-to-speech synthesizer

2010 International Conference on Signal Processing and Multimedia Applications (SIGMAP)
Pub Date: 2010-07-26
DOI: 10.5220/0002978901510154

Kyungjin Byun, N. Eum, Heebum Jung

Abstract

High-quality text-to-speech (TTS) synthesis requires a large amount of storage for speech segments, because most current TTS synthesizers are based on a technique known as synthesis by concatenation. Speech coders offer an efficient way to compress the segment database in a TTS system. Waveform interpolation (WI) has been shown to be an efficient speech coding algorithm that provides high-quality speech at low bit rates. However, a speech coder used in a TTS system must differ from one used in communication applications, because the decoder in a TTS system must be able to decode segmented frames. In this paper, we propose a novel WI decoder scheme that can handle segmented frame decoding. The decoder reconstructs good-quality speech even at concatenation boundaries, which makes it effective for TTS systems based on synthesis by concatenation.
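
The abstract does not detail the proposed scheme, so the following is only a minimal sketch of why segmented frames matter to an interpolating decoder. It assumes a heavily simplified WI decoder that linearly interpolates between the characteristic waveforms of consecutive frames, and it omits pitch alignment and all other real decoder stages. The names `interpolate_frame` and `decode_segments`, and the rule of seeding each segment with its own first waveform, are hypothetical illustrations and not the authors' method.

```python
from typing import List, Sequence

Waveform = List[float]


def interpolate_frame(prev_cw: Sequence[float],
                      cur_cw: Sequence[float],
                      steps: int) -> List[Waveform]:
    """Linearly interpolate from prev_cw to cur_cw.

    Returns `steps` waveforms; the last one equals cur_cw.
    """
    out: List[Waveform] = []
    for k in range(1, steps + 1):
        a = k / steps  # interpolation weight, reaches 1.0 at the frame end
        out.append([(1 - a) * p + a * c for p, c in zip(prev_cw, cur_cw)])
    return out


def decode_segments(segments: Sequence[Sequence[Waveform]],
                    steps: int = 4) -> List[Waveform]:
    """Decode concatenated segments frame by frame.

    Assumption (hypothetical boundary rule): the decoder state is reset at
    the start of every segment and seeded with that segment's own first
    waveform, so no state from an unrelated preceding segment is
    interpolated across the concatenation boundary.
    """
    decoded: List[Waveform] = []
    for segment in segments:
        prev = None  # no valid history at a segment boundary
        for cw in segment:
            if prev is None:
                prev = cw  # seed with the segment's own first waveform
            decoded.extend(interpolate_frame(prev, cw, steps))
            prev = cw
    return decoded


if __name__ == "__main__":
    # Two segments taken from different places in a (toy) database.
    segs = [[[0.0, 0.0], [1.0, 1.0]], [[4.0, 4.0]]]
    for w in decode_segments(segs, steps=2):
        print(w)
```

A communication decoder can always assume that the previous frame is the true predecessor of the current one. In a concatenative TTS system that assumption fails at every segment boundary, which is the situation the reset in `decode_segments` is meant to illustrate.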
