Wideband audio compression using a combined wavelet and WLPC representation
Daryl Ning, Mohamed Deriche
Proceedings of the Sixth International Symposium on Signal Processing and its Applications (Cat.No.01EX467)
DOI: 10.1109/ISSPA.2001.950247
Citation count: 1
Abstract
In this paper, we present results for an audio coder that combines the wavelet transform with warped linear prediction (WLP). In contrast to conventional LP, WLP allows the frequency resolution to be controlled so that it closely matches that of the human auditory system. The coder first applies WLP analysis to each frame of audio, and then applies a discrete wavelet transform (DWT) to the residual signal (prediction error). A psychoacoustic model runs in parallel to obtain a global masking threshold for bit allocation. Bits are dynamically allocated to the DWT coefficients in an attempt to minimise the perceptually significant quantisation error. For monophonic signals sampled at 44.1 kHz, the coder achieves near-transparent quality for a variety of speech and music signals at an average bit rate of 64 kb/s. The strength of the proposed coder lies in its easy scalability to lower bit rates.
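The processing chain described in the abstract (WLP analysis, DWT of the residual, threshold-driven bit allocation) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' design: the warping coefficient `lam = 0.756` (a value commonly quoted for Bark-like warping at 44.1 kHz), the prediction order, the Haar wavelet, the flat placeholder masking threshold standing in for the psychoacoustic model, and the roughly 6 dB-per-bit noise model. The function names are hypothetical.

```python
import math

def allpass(x, lam):
    """First-order allpass D(z) = (z^-1 - lam) / (1 - lam z^-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for v in x:
        out = -lam * v + x_prev + lam * y_prev
        y.append(out)
        x_prev, y_prev = v, out
    return y

def levinson(r, order):
    """Levinson-Durbin recursion: predictor coefficients from autocorrelation r."""
    a, err = [0.0] * order, r[0]
    for i in range(order):
        if err <= 0.0:  # stop if the recursion becomes ill-conditioned
            break
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a, err = new_a, err * (1.0 - k * k)
    return a

def wlp_residual(x, order, lam):
    """Warped LP: unit delays are replaced by allpass sections (lam=0 gives plain LP)."""
    taps = [list(x)]
    for _ in range(order):
        taps.append(allpass(taps[-1], lam))
    # Warped autocorrelation, truncated to the frame (an approximation).
    r = [sum(p * q for p, q in zip(taps[0], taps[k])) for k in range(order + 1)]
    a = levinson(r, order)
    res = [taps[0][n] - sum(a[k] * taps[k + 1][n] for k in range(order))
           for n in range(len(x))]
    return res, a

def haar_dwt(x, levels):
    """Multi-level Haar DWT; returns [approx, detail_coarsest, ..., detail_finest].
    Assumes len(x) is divisible by 2**levels (a trailing odd sample is dropped)."""
    s, approx, details = math.sqrt(0.5), list(x), []
    for _ in range(levels):
        pairs = range(0, len(approx) - 1, 2)
        hi = [(approx[i] - approx[i + 1]) * s for i in pairs]
        approx = [(approx[i] + approx[i + 1]) * s for i in pairs]
        details.insert(0, hi)
    return [approx] + details

def allocate_bits(band_energy, mask, budget, max_bits=16):
    """Greedy allocation: give the next bit to the band with the worst
    noise-to-mask ratio, assuming noise power drops 4x (about 6 dB) per bit.
    Simplification: the budget counts per-band bit increments, not total bits."""
    bits = [0] * len(band_energy)
    for _ in range(budget):
        nmr = [e * 4.0 ** (-b) / m if b < max_bits else 0.0
               for e, b, m in zip(band_energy, bits, mask)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 1.0:  # all quantisation noise is already masked
            break
        bits[worst] += 1
    return bits

if __name__ == "__main__":
    # One 64-sample frame of a test tone sampled at 44.1 kHz.
    frame = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(64)]
    residual, coeffs = wlp_residual(frame, order=8, lam=0.756)
    bands = haar_dwt(residual, levels=3)
    energy = [sum(c * c for c in b) / max(len(b), 1) for b in bands]
    mask = [1e-4] * len(bands)  # placeholder for a psychoacoustic threshold
    print(allocate_bits(energy, mask, budget=24))
```

In this sketch, lowering `budget` is the only change needed to target a lower rate: the greedy loop simply stops earlier and leaves the least audible bands with fewer bits, which is one plausible reading of the scalability the abstract mentions.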