Very low bit rate (VLBR) speech coding around 500 bits/sec

2004 12th European Signal Processing Conference. Pub Date: 2004-09-06. DOI: 10.5281/ZENODO.38454

M. Padellini, F. Capman, G. Baudoin


Citations: 8

Abstract



New solutions to very low bit rate (VLBR) speech coding based on speech recognition and speech synthesis technologies have recently been proposed [1,2,3,4,5,7,8]. Continuing the work described in [8], this paper presents a complete encoding scheme operating at around 500 bits/sec. The proposed solution is based on automatic recognition of elementary acoustical units using hidden Markov models (HMMs). An unsupervised training phase is used to build the HMM models and the codebook of synthesis units. The decoded speech is then obtained by concatenating the corresponding synthesis units, based on an HNM-like (harmonic plus noise model) decomposition of speech. A new unit selection process that integrates prosody constraints is proposed. Through this approach, the size of the synthesis codebook is independent of the targeted bit rate. A complete description of the unit selection process and of the associated prosody modelling is given, together with the quantisation scheme for the overall set of encoded parameters.
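The abstract gives no numerical details, so the following Python sketch is only one possible reading of how prosody-constrained unit selection could decouple codebook size from bit rate. Everything in it is an assumption made for illustration: the `SynthesisUnit` fields, the distance weights, the `preselect` and `bits_per_unit` helpers, and all numbers. The idea shown is that if candidates are first narrowed to a fixed-size shortlist by prosody distance, the transmitted selection index depends on the shortlist size rather than on how many units the codebook stores.

```python
import math
from dataclasses import dataclass


@dataclass
class SynthesisUnit:
    unit_class: int     # index of the recognized acoustic unit class (hypothetical field)
    pitch_hz: float     # average pitch of the stored segment
    energy_db: float    # average energy of the stored segment
    duration_ms: float  # length of the stored segment


def prosody_distance(unit, pitch_hz, energy_db, duration_ms):
    """Hypothetical weighted prosody distance; the weights are placeholders."""
    return (abs(math.log2(unit.pitch_hz / pitch_hz))
            + 0.1 * abs(unit.energy_db - energy_db)
            + 0.01 * abs(unit.duration_ms - duration_ms))


def preselect(codebook, unit_class, pitch_hz, energy_db, duration_ms, n_keep):
    """Keep the n_keep units of the given class closest to the target prosody."""
    candidates = [u for u in codebook if u.unit_class == unit_class]
    candidates.sort(
        key=lambda u: prosody_distance(u, pitch_hz, energy_db, duration_ms))
    return candidates[:n_keep]


def bits_per_unit(n_classes, n_keep, prosody_bits):
    """Bits for one unit: class index + index in the shortlist + prosody.

    The total codebook size does not appear here; only the shortlist
    size n_keep does.
    """
    return (math.ceil(math.log2(n_classes))
            + math.ceil(math.log2(n_keep))
            + prosody_bits)


# Assumed figures, chosen only to show the arithmetic:
# 64 unit classes, a shortlist of 16, 15 prosody bits, 20 units per second.
rate = 20 * bits_per_unit(n_classes=64, n_keep=16, prosody_bits=15)
print(rate, "bits/sec")
```

Under these assumed figures each unit costs 6 + 4 + 15 = 25 bits, so 20 units per second gives 500 bits/sec. Enlarging the codebook changes the pool that `preselect` searches but leaves `bits_per_unit` unchanged.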
