A novel WI decoder for the segmented frame decoding in the text-to-speech synthesizer

Kyungjin Byun, N. Eum, Heebum Jung
Published in: 2010 International Conference on Signal Processing and Multimedia Applications (SIGMAP), 2010-07-26
DOI: 10.5220/0002978901510154 (https://doi.org/10.5220/0002978901510154)

Abstract

Implementing a high-quality text-to-speech (TTS) system requires a large amount of storage for the many speech segments involved, because most current TTS synthesizers are based on a technique known as synthesis by concatenation. Speech coders offer an efficient way to compress the database in a TTS system. Waveform interpolation (WI) has been shown to be an efficient speech coding algorithm that provides high-quality speech at low bit rates. However, a speech coder used in a TTS system has to differ from one used in communication applications, because the decoder in a TTS system must be able to decode segmented frames. In this paper, we propose a novel WI decoder scheme that can handle segmented frame decoding. The decoder can reconstruct good-quality speech even at the concatenation boundary, which makes it effective for TTS systems based on synthesis by concatenation.
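
To make the idea of segmented frame decoding more concrete, the following is a minimal sketch of generic waveform interpolation, not the decoder proposed in the paper. It assumes each frame carries one pitch-cycle waveform and a pitch period in samples, and that a segment pulled from the TTS database has to be decodable without any state from a previous frame. All names here (`sample_cycle`, `decode_frame`, `decode_segment`) and the choice to seed the decoder state from the segment's own first frame are illustrative assumptions.

```python
import math


def sample_cycle(cw, phase):
    """Read one pitch-cycle waveform `cw` at a fractional phase in [0, 1),
    using linear interpolation between neighbouring samples (cyclic)."""
    pos = (phase % 1.0) * len(cw)
    whole = int(pos)
    frac = pos - whole
    i = whole % len(cw)
    j = (i + 1) % len(cw)
    return (1.0 - frac) * cw[i] + frac * cw[j]


def decode_frame(prev_cw, next_cw, prev_pitch, next_pitch, phase, frame_len):
    """Cross-fade from `prev_cw` to `next_cw` over one frame while the
    phase advances by 1/pitch per output sample. Returns (samples, phase)."""
    out = []
    for n in range(frame_len):
        alpha = n / frame_len  # interpolation weight, 0 at frame start
        value = ((1.0 - alpha) * sample_cycle(prev_cw, phase)
                 + alpha * sample_cycle(next_cw, phase))
        out.append(value)
        pitch = (1.0 - alpha) * prev_pitch + alpha * next_pitch
        phase = (phase + 1.0 / pitch) % 1.0
    return out, phase


def decode_segment(frames, frame_len):
    """Decode a list of (cycle_waveform, pitch_period) frames as an
    isolated segment. ASSUMPTION: with no preceding frame available,
    the state is seeded from the segment's first frame and phase 0."""
    prev_cw, prev_pitch = frames[0]
    phase = 0.0
    speech = []
    for cw, pitch in frames:
        samples, phase = decode_frame(prev_cw, cw, prev_pitch, pitch,
                                      phase, frame_len)
        speech.extend(samples)
        prev_cw, prev_pitch = cw, pitch
    return speech


# Toy input: two identical frames holding one sine cycle of 8 samples.
cycle = [math.sin(2.0 * math.pi * k / 8) for k in range(8)]
speech = decode_segment([(cycle, 8.0), (cycle, 8.0)], frame_len=16)
```

Because `decode_segment` keeps no state between calls, two segments taken from different places in the database can be decoded independently and then concatenated. How the phase and waveform are matched at the concatenation boundary is the part the paper addresses, and it is not modelled in this sketch.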