A novel short term prediction method for speech using Haar based time varying models
Authors: Deepak Joy, R. Kumar, S. Pathak
Published in: 2011 2nd International Conference on Computer and Communication Technology (ICCCT-2011), 2011-11-10
DOI: 10.1109/ICCCT.2011.6075174
Speech is a non-stationary signal. Its non-stationarity arises from variations in emotion, speaker, and environment. Almost all speech coding standards available today rely on stationary models for the time-varying parameters of the speech generation model, which affects the perceptual quality of the coded speech. Physically, non-stationarity can be interpreted as a manifestation of the time-varying nature of the speech generation source, the vocal tract. The vocal tract can be roughly modelled as an autoregressive (AR) filter (an all-pole model), so its time-varying nature corresponds to time-varying AR parameters. These time-varying AR parameters are expressed as a weighted sum of Haar wavelets. The estimation of the time-varying AR parameters thus reduces to finding the time-invariant coefficients of the Haar wavelet basis functions. Here we propose a variable bit rate codec that brings non-stationary modelling of the time-varying AR parameters into speech coding. Long-term prediction (LTP) and the analysis-by-synthesis (AbS) method can further be incorporated to develop a complete codec around this short-term prediction method. The Haar wavelet based speech coding method is seen to outperform the traditional method.
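The reduction from time-varying AR parameters to time-invariant Haar coefficients can be illustrated with a small sketch. It is not the authors' estimator, which the abstract does not specify. It assumes the model x[n] = sum_k a_k(n) x[n-k] + e[n] with a_k(n) = sum_j c[k][j] phi_j(n), unnormalised Haar functions phi_j, and an ordinary least-squares fit through the normal equations. The names haar_basis, solve, fit_tvar and ar_trajectory are hypothetical helpers introduced only for this illustration.

```python
def haar_basis(num_funcs, n_samples):
    """First num_funcs Haar functions sampled at n = 0..n_samples-1.

    Row 0 is the constant scaling function; later rows are wavelets ordered
    by scale and shift, taking the values +1, -1 or 0 (unnormalised, an
    assumption made here for readability).
    """
    rows = [[1.0] * n_samples]
    j = 0
    while len(rows) < num_funcs:
        width = n_samples / 2 ** j  # support length at scale j
        for k in range(2 ** j):
            if len(rows) == num_funcs:
                break
            start, mid, end = k * width, (k + 0.5) * width, (k + 1) * width
            rows.append([1.0 if start <= n < mid else -1.0 if mid <= n < end else 0.0
                         for n in range(n_samples)])
        j += 1
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_tvar(x, order, num_funcs):
    """Least-squares estimate of the time-invariant coefficients c[k][j].

    Row k of the result holds the Haar coefficients of lag k + 1.
    """
    phi = haar_basis(num_funcs, len(x))
    dim = order * num_funcs
    gram = [[0.0] * dim for _ in range(dim)]
    rhs = [0.0] * dim
    for n in range(order, len(x)):
        # Regressor for sample n: phi_j(n) * x[n-k], lags outer, basis inner.
        row = [phi[j][n] * x[n - k]
               for k in range(1, order + 1) for j in range(num_funcs)]
        for r in range(dim):
            rhs[r] += row[r] * x[n]
            for c in range(dim):
                gram[r][c] += row[r] * row[c]
    flat = solve(gram, rhs)
    return [flat[k * num_funcs:(k + 1) * num_funcs] for k in range(order)]


def ar_trajectory(coeffs, n_samples):
    """Expand the Haar coefficients back into a_k(n) for every sample n."""
    phi = haar_basis(len(coeffs[0]), n_samples)
    return [[sum(c * phi[j][n] for j, c in enumerate(row))
             for n in range(n_samples)] for row in coeffs]


# Toy data (not speech): a noise-free first-order AR signal whose
# coefficient switches from 0.9 to -0.5 halfway through the frame.
N = 64
signal = [1.0]
for n in range(1, N):
    signal.append((0.9 if n < N // 2 else -0.5) * signal[-1])

coeffs = fit_tvar(signal, order=1, num_funcs=2)
print([round(v, 6) for v in coeffs[0]])
```

In this toy case the coefficient trajectory is piecewise constant, so two Haar functions represent it exactly: the fitted pair satisfies c0 + c1 = 0.9 on the first half of the frame and c0 - c1 = -0.5 on the second. Real speech has a non-zero residual e[n], and the number of Haar functions per frame becomes a trade-off between tracking accuracy and the number of coefficients to transmit.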