Speech and image signal compression with wavelets
W. Kinsner, A. Langi
IEEE WESCANEX 93 Communications, Computers and Power in the Modern Environment - Conference Proceedings, 1993-05-17
DOI: 10.1109/WESCAN.1993.270520 (https://doi.org/10.1109/WESCAN.1993.270520)
The authors consider wavelet-based time-frequency multiresolution analysis as it applies to speech/audio and image/video signal compression, and compare it with the traditional short-window techniques used in signal compression. In terms of bit rate and signal quality, the discrete wavelet transform performs comparably to other techniques, such as the discrete cosine transform (DCT) for images and code-excited linear predictive coding (CELP) for speech, but with a much lower computational burden. Experiments with an image and the Daubechies four-coefficient wavelet show that truncating as many as 90% of the wavelet coefficients still yields a peak signal-to-noise ratio (PSNR) of 30 dB, which is better than DCT. In an experiment on a sentence spoken by a male speaker, the scheme reaches a segmental signal-to-noise ratio (SEGSNR) of 12.82 dB at a rate below 4.8 kbit/s. By comparison, state-of-the-art CELP coding at 4.8 kbit/s attains an SEGSNR of 10-13 dB. Other experiments with images and the two-coefficient Haar wavelet are also highlighted.
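The coefficient-truncation idea and the two quality measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1-D signal whose length is a power of two, uses the two-coefficient Haar wavelet rather than the Daubechies four-coefficient wavelet, and truncates by zeroing the smallest-magnitude coefficients. The function names, the synthetic test signal, the 16-sample frame length, and the peak value of 255 are all illustrative assumptions.

```python
import math

SQRT_HALF = 1 / math.sqrt(2)

def haar_forward(signal):
    """Full multi-level orthonormal Haar DWT (length must be a power of two)."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) * SQRT_HALF for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) * SQRT_HALF for i in range(half)]
        coeffs[:n] = approx + detail  # approximations first, then details
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) * SQRT_HALF)
            merged.append((a - d) * SQRT_HALF)
        out[:2 * n] = merged
        n *= 2
    return out

def truncate(coeffs, fraction):
    """Zero the given fraction of coefficients, smallest magnitudes first."""
    n_drop = int(len(coeffs) * fraction)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    kept = list(coeffs)
    for i in order[:n_drop]:
        kept[i] = 0.0
    return kept

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 is an assumed 8-bit range)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def segsnr(original, reconstructed, frame=16):
    """Mean per-frame SNR in dB; frames with zero energy or zero error are skipped."""
    values = []
    for start in range(0, len(original) - frame + 1, frame):
        ref = original[start:start + frame]
        rec = reconstructed[start:start + frame]
        sig = sum(x * x for x in ref)
        err = sum((x - y) ** 2 for x, y in zip(ref, rec))
        if sig > 0 and err > 0:
            values.append(10 * math.log10(sig / err))
    return sum(values) / len(values) if values else math.inf

# Synthetic stand-in for real data: two sinusoids on a mid-grey offset.
n = 256
signal = [127.5 + 100 * math.sin(2 * math.pi * 3 * t / n)
          + 20 * math.sin(2 * math.pi * 17 * t / n) for t in range(n)]

for fraction in (0.5, 0.9):
    rec = haar_inverse(truncate(haar_forward(signal), fraction))
    print(f"drop {fraction:.0%}: PSNR {psnr(signal, rec):.2f} dB, "
          f"SEGSNR {segsnr(signal, rec):.2f} dB")
```

Because the transform here is orthonormal, the reconstruction error energy equals the energy of the discarded coefficients, which is why dropping the smallest coefficients first is a natural choice. The figures this prints for a synthetic signal are not comparable to the 30-dB and 12.82-dB results reported above, which come from a real image and a real speech sentence.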