Speech and image signal compression with wavelets

IEEE WESCANEX 93 Communications, Computers and Power in the Modern Environment - Conference Proceedings. Pub Date: 1993-05-17. DOI: 10.1109/WESCAN.1993.270520

W. Kinsner, A. Langi

Citations: 49

Abstract

Speech and image signal compression with wavelets

The authors consider time-frequency multiresolution analysis based on wavelets as it applies to speech/audio and image/video signal compression, and compare wavelet analysis with the traditional short-window techniques used in signal compression. In terms of bit rate and signal quality, the discrete wavelet transform performs comparably to other techniques, such as the discrete cosine transform (DCT) for images and code-excited linear predictive (CELP) coding for speech, but with a much smaller computational burden. Experiments on an image with Daubechies' four-coefficient wavelet show that truncating as many as 90% of the wavelet coefficients still yields a peak signal-to-noise ratio (PSNR) of 30 dB, which is better than DCT. In an experiment on a sentence spoken by a male speaker, the scheme reaches a segmental signal-to-noise ratio (SEGSNR) of 12.82 dB at a rate below 4.8 kbit/s. In comparison, state-of-the-art CELP coding at 4.8 kbit/s attains an SEGSNR of 10-13 dB. Other experiments with images and the Haar two-coefficient wavelet are also highlighted.
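
The coefficient-truncation experiment mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a 1-D signal rather than an image, periodic boundary handling, a three-level decomposition, and a "keep the largest-magnitude coefficients" rule; the function names (`d4_forward`, `dwt`, `truncate`, `psnr`) and the test signal are hypothetical and chosen only for illustration. PSNR is computed here as 10 log10(peak^2 / MSE) with an assumed peak of 255.

```python
import math

# Daubechies four-coefficient (D4) low-pass filter H and its
# quadrature-mirror high-pass filter G.
S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
H = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]


def d4_forward(x):
    """One level of the periodic D4 transform; returns (approx, detail)."""
    n = len(x)
    approx = [sum(H[k] * x[(2 * i + k) % n] for k in range(4))
              for i in range(n // 2)]
    detail = [sum(G[k] * x[(2 * i + k) % n] for k in range(4))
              for i in range(n // 2)]
    return approx, detail


def d4_inverse(approx, detail):
    """Invert one level; exact because the periodic transform is orthogonal."""
    n = 2 * len(approx)
    x = [0.0] * n
    for i in range(len(approx)):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * approx[i] + G[k] * detail[i]
    return x


def dwt(x, levels):
    """Multi-level transform; layout is [approx, coarsest detail, ..., finest]."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = d4_forward(approx)
        details = d + details
    return approx + details


def idwt(coeffs, levels):
    """Inverse of dwt() for the same number of levels."""
    n = len(coeffs) >> levels
    approx = coeffs[:n]
    for _ in range(levels):
        approx = d4_inverse(approx, coeffs[n:2 * n])
        n *= 2
    return approx


def truncate(coeffs, fraction):
    """Zero the given fraction of coefficients with the smallest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    out = list(coeffs)
    for i in order[:int(len(coeffs) * fraction)]:
        out[i] = 0.0
    return out


def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumed peak value of 255)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical smooth test signal; length must be divisible by 2**levels,
    # with at least 4 samples entering the deepest level.
    signal = [128 + 100 * math.sin(2 * math.pi * i / 64) for i in range(256)]
    kept = truncate(dwt(signal, 3), 0.90)
    print("PSNR after zeroing 90% of coefficients (dB):",
          psnr(signal, idwt(kept, 3)))
```

Because the periodic D4 transform is orthogonal, the squared reconstruction error equals the energy of the discarded coefficients, which is why dropping the smallest ones costs the least quality. The figures reported in the abstract come from the authors' own image and speech experiments and are not reproduced by this sketch.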
