Authors: Daryl Ning, Mohamed Deriche
Venue: Proceedings of the Sixth International Symposium on Signal Processing and its Applications (Cat.No.01EX467)
DOI: 10.1109/ISSPA.2001.950247 (https://doi.org/10.1109/ISSPA.2001.950247)
Wideband audio compression using a combined wavelet and WLPC representation
In this paper, we present results for an audio coder that combines the wavelet transform with warped linear prediction (WLP). In contrast to conventional LP, WLP allows the frequency resolution to be controlled so that it closely matches that of the human auditory system. The coder first applies WLP analysis to each frame of audio, and then applies a discrete wavelet transform (DWT) to the residual signal (the prediction error). A psychoacoustic model runs in parallel to obtain a global masking threshold, which is used in bit allocation. Bits are dynamically allocated to the DWT coefficients in an attempt to minimise the perceptually significant quantisation error. For monophonic signals sampled at 44.1 kHz, the coder achieves near-transparent quality for a variety of speech and music signals at an average bit rate of 64 kb/s. The power of the proposed coder resides in its easy scalability to lower bit rates.
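The first two stages described in the abstract (WLP analysis of a frame, then a DWT of the residual) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything below is an assumption made for illustration: the function names, the prediction order, the warping coefficient 0.756 (a value often quoted for approximating the Bark scale at 44.1 kHz), the truncation of the all-pass responses to the frame length, and the single-level Haar transform standing in for whichever wavelet the paper uses. The psychoacoustic model and the bit allocation are not shown.

```python
import math


def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam z^-1).
    With lam = 0 this is a plain unit delay."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -lam * xn + x_prev + lam * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def levinson(r, order):
    """Levinson-Durbin recursion; returns predictor coefficients a[1..order]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame: stop early, keep remaining zeros
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def wlp_analysis(frame, order, lam):
    """Warped LP: unit delays are replaced by a chain of all-pass sections.
    Assumption: all-pass outputs are truncated to the frame length, so the
    warped autocorrelation is only approximate when lam != 0."""
    chain = [list(frame)]
    for _ in range(order):
        chain.append(allpass(chain[-1], lam))
    # Warped autocorrelation: inner product of the frame with each chain output.
    r = [sum(x * d for x, d in zip(frame, dk)) for dk in chain]
    a = levinson(r, order)
    # Residual = frame minus its prediction from the warped delay chain.
    residual = [
        x - sum(a[k] * chain[k + 1][n] for k in range(order))
        for n, x in enumerate(frame)
    ]
    return a, residual


def haar_dwt(x):
    """One level of an orthonormal Haar DWT (a stand-in wavelet).
    Assumes an even-length input."""
    s = 2.0 ** -0.5
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


if __name__ == "__main__":
    # Hypothetical test frame: two sinusoids sampled at 44.1 kHz.
    fs = 44100.0
    frame = [
        math.sin(2 * math.pi * 440.0 * n / fs)
        + 0.5 * math.sin(2 * math.pi * 3000.0 * n / fs)
        for n in range(512)
    ]
    coeffs, residual = wlp_analysis(frame, order=8, lam=0.756)
    approx, detail = haar_dwt(residual)
    print(len(coeffs), len(approx), len(detail))
```

Setting `lam` to 0 turns each all-pass section into a unit delay, so the same code reduces to conventional autocorrelation-method LP. This makes the contrast drawn in the abstract concrete: the only structural change in WLP is the delay element.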