A novel short term prediction method for speech using Haar based time varying models

Deepak Joy, R. Kumar, S. Pathak
Published in: 2011 2nd International Conference on Computer and Communication Technology (ICCCT-2011), 2011-11-10
DOI: 10.1109/ICCCT.2011.6075174 (https://doi.org/10.1109/ICCCT.2011.6075174)
Citations: 0

Abstract

Speech is a non-stationary signal. Its non-stationarity arises from emotional, speaker, and environmental variations. Almost all speech coding standards available today rely on stationary models for the time-varying parameters of the speech generation model, which affects the perceptual quality of the coded speech. Physically, non-stationarity can be interpreted as a manifestation of the time-varying nature of the speech generation source, the vocal tract. The vocal tract can be roughly modelled as an autoregressive (AR) filter (all-pole model), so its time-varying nature corresponds to time-varying AR parameters. These parameters are expressed as a sum of Haar wavelets, so estimating them reduces to finding the time-invariant coefficients of the Haar wavelet basis functions. Here we propose a variable bit rate codec that attempts to bring non-stationary modelling of the time-varying AR parameters into speech coding. Long-term prediction (LTP) and the analysis-by-synthesis (AbS) method can further be incorporated to develop a complete codec around this short-term prediction method. The Haar wavelet based speech coding method is seen to outperform the traditional method.
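The reduction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the coefficients are estimated, so ordinary least squares is assumed here, along with the sign convention x[n] = sum_k a_k[n] x[n-k] + e[n], unnormalised Haar functions sampled over one frame, and the hypothetical helper names `haar_basis`, `solve`, and `fit_tvar_haar`. Writing a_k[n] = sum_j c_kj h_j[n] makes the model linear in the time-invariant coefficients c_kj, so they can be found by solving one set of normal equations per frame.

```python
import random

def haar_basis(num_funcs, length):
    """First `num_funcs` unnormalised Haar functions sampled at `length` points on [0, 1)."""
    basis = [[1.0] * length]  # constant (scaling) function
    scale = 0
    while len(basis) < num_funcs:
        for shift in range(2 ** scale):
            if len(basis) == num_funcs:
                break
            row = []
            for n in range(length):
                # time rescaled so that this wavelet's support is [0, 1)
                u = (n + 0.5) / length * 2 ** scale - shift
                row.append(1.0 if 0 <= u < 0.5 else -1.0 if 0.5 <= u < 1 else 0.0)
            basis.append(row)
        scale += 1
    return basis

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [A[i][:] + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * m
    for i in reversed(range(m)):
        z[i] = (M[i][m] - sum(M[i][c] * z[c] for c in range(i + 1, m))) / M[i][i]
    return z

def fit_tvar_haar(x, order, num_funcs):
    """Least-squares estimate of c[k][j] in a_(k+1)[n] = sum_j c[k][j] * h_j[n]."""
    h = haar_basis(num_funcs, len(x))
    dim = order * num_funcs
    gram = [[0.0] * dim for _ in range(dim)]
    rhs = [0.0] * dim
    for n in range(order, len(x)):
        # regressor for the pair (k, j) is h_j[n] * x[n - k]
        phi = [h[j][n] * x[n - k] for k in range(1, order + 1) for j in range(num_funcs)]
        for r in range(dim):
            rhs[r] += phi[r] * x[n]
            for c in range(dim):
                gram[r][c] += phi[r] * phi[c]
    flat = solve(gram, rhs)
    return [flat[k * num_funcs:(k + 1) * num_funcs] for k in range(order)]

# Synthetic test frame (made-up values): a_1 steps from 1.2 to 0.4 mid-frame, a_2 = -0.5.
rng = random.Random(0)
N = 4000
x = [0.0, 0.0]
for n in range(2, N):
    a1 = 1.2 if n < N // 2 else 0.4
    x.append(a1 * x[-1] - 0.5 * x[-2] + rng.gauss(0.0, 1.0))

coeffs = fit_tvar_haar(x, order=2, num_funcs=2)
print(coeffs)  # estimates should lie near [[0.8, 0.4], [-0.5, 0.0]]
```

In this toy frame the step in a_1 is captured exactly by two Haar functions (0.8 times the constant plus 0.4 times the mother wavelet), which is why a piecewise-constant basis suits abrupt parameter changes. A stationary model would instead need one coefficient set per sub-frame. How the paper chooses the number of basis functions, and how that maps to its variable bit rate, is not stated in the abstract.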