A novel short term prediction method for speech using Haar based time varying models

2011 2nd International Conference on Computer and Communication Technology (ICCCT-2011); Pub Date: 2011-11-10; DOI: 10.1109/ICCCT.2011.6075174

Deepak Joy, R. Kumar, S. Pathak


Citations: 0

Abstract


Speech is a non-stationary signal. Its non-stationarity arises from variations in emotion, speaker, and environment. Almost all speech coding standards available today rely on stationary models for the time-varying parameters of the speech generation model, which affects the perceptual quality of the coded speech. Physically, non-stationarity can be interpreted as a manifestation of the time-varying nature of the speech generation source, the vocal tract. The vocal tract can be roughly modelled as an autoregressive (AR) filter (an all-pole model), so its time-varying nature corresponds to time-varying AR parameters. These time-varying AR parameters are expressed as a sum of Haar wavelets, so estimating them reduces to finding the time-invariant coefficients of the Haar wavelet basis functions. Here we propose a variable bit rate codec that attempts to bring in non-stationary modelling of the time-varying AR parameters of speech. Long-term prediction (LTP) and the analysis-by-synthesis (AbS) method can further be incorporated to develop a codec using this short-term prediction method. The Haar wavelet based speech coding method is seen to outperform the traditional method.
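The reduction described in the abstract can be illustrated with a minimal sketch. It assumes the prediction form x[n] ≈ Σ_i a_i[n] x[n-i] with a_i[n] = Σ_k c_ik h_k[n], where h_k are unnormalized Haar functions sampled over one frame. Substituting the expansion makes the model linear in the time-invariant coefficients c_ik, so they can be fitted by ordinary least squares. The abstract does not state which estimator the authors use, so least squares is an assumption here, and the names `haar_basis` and `fit_tvar_haar` are hypothetical.

```python
import numpy as np


def haar_basis(num_funcs, length):
    """First `num_funcs` unnormalized Haar functions sampled at `length` points on [0, 1)."""
    t = (np.arange(length) + 0.5) / length  # sample at interval midpoints
    basis = np.zeros((num_funcs, length))
    basis[0] = 1.0  # constant (scaling) function
    k, level = 1, 0
    while k < num_funcs:
        for shift in range(2 ** level):
            if k >= num_funcs:
                break
            lo = shift / 2 ** level
            mid = (shift + 0.5) / 2 ** level
            hi = (shift + 1) / 2 ** level
            # +1 on the first half of the support, -1 on the second half
            basis[k] = ((t >= lo) & (t < mid)).astype(float) - ((t >= mid) & (t < hi)).astype(float)
            k += 1
        level += 1
    return basis


def fit_tvar_haar(x, order, num_funcs):
    """Least-squares fit of a time-varying AR model whose coefficients are Haar expansions.

    Returns (c, a): c has shape (order, num_funcs) and holds the time-invariant
    coefficients; a has shape (order, len(x)) and holds the AR trajectories a_i[n].
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    basis = haar_basis(num_funcs, n)
    rows = np.arange(order, n)  # samples with a full set of past values
    # One regressor per (lag i, basis function k): h_k[n] * x[n - i]
    columns = [
        basis[k, rows] * x[rows - i]
        for i in range(1, order + 1)
        for k in range(num_funcs)
    ]
    phi = np.stack(columns, axis=1)
    coeffs, *_ = np.linalg.lstsq(phi, x[rows], rcond=None)
    c = coeffs.reshape(order, num_funcs)
    return c, c @ basis


if __name__ == "__main__":
    # Synthetic frame: AR(1) coefficient switches from 0.5 to -0.5 at the midpoint.
    rng = np.random.default_rng(0)
    n = 4000
    a_true = np.where(np.arange(n) < n // 2, 0.5, -0.5)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = a_true[t] * x[t - 1] + rng.standard_normal()
    c, a = fit_tvar_haar(x, order=1, num_funcs=2)
    print(c)
```

In the demo the switching coefficient is exactly representable with two Haar functions (a constant plus one step), which is the kind of abrupt change a piecewise-constant basis captures with few coefficients. A stationary AR model is the special case `num_funcs=1`.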
