Speech and image signal compression with wavelets

W. Kinsner, A. Langi
Published in: IEEE WESCANEX 93 Communications, Computers and Power in the Modern Environment - Conference Proceedings
Publication date: 1993-05-17
DOI: 10.1109/WESCAN.1993.270520 (https://doi.org/10.1109/WESCAN.1993.270520)
Citations: 49

Abstract

The authors consider wavelet-based time-frequency multiresolution analysis as it applies to speech/audio and image/video signal compression, and compare it with the traditional short-window techniques used in signal compression. In terms of bit rate and signal quality, the discrete wavelet transform performs comparably to other techniques such as the discrete cosine transform (DCT) for images and code-excited linear predictive coding (CELP) for speech, but with much less computational burden. Experiments with an image and the Daubechies four-coefficient wavelet show that truncating as many as 90% of the wavelet coefficients still yields a peak signal-to-noise ratio (PSNR) of 30 dB, which is better than DCT. In an experiment on a sentence spoken by a male speaker, the scheme reaches a segmental signal-to-noise ratio (SEGSNR) of 12.82 dB at a rate below 4.8 kbit/s. In comparison, state-of-the-art CELP coding at 4.8 kbit/s attains a SEGSNR of 10-13 dB. Other experiments with images and the Haar two-coefficient wavelet are also highlighted.
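
The coefficient-truncation experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1-D synthetic signal instead of an image, periodic boundary extension, and four decomposition levels, and the function names (`d4_forward`, `dwt_flat`, `truncate_smallest`, `psnr`) are hypothetical. It applies the Daubechies four-coefficient wavelet, zeroes a chosen fraction of the smallest-magnitude coefficients, reconstructs the signal, and reports the PSNR.

```python
import math

# Daubechies four-coefficient (D4) low-pass filter H and high-pass filter G.
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]


def d4_forward(x):
    """One D4 analysis step with periodic wrap-around (assumption)."""
    n = len(x)
    approx = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    detail = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return approx, detail


def d4_inverse(approx, detail):
    """One D4 synthesis step; the transpose of the orthogonal analysis step."""
    n = 2 * len(approx)
    x = [0.0] * n
    for i in range(len(approx)):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * approx[i] + G[k] * detail[i]
    return x


def dwt_flat(x, levels):
    """Multi-level DWT stored as [coarsest approx | details, coarse to fine]."""
    out = list(x)
    n = len(out)
    for _ in range(levels):
        approx, detail = d4_forward(out[:n])
        out[:n] = approx + detail
        n //= 2
    return out


def idwt_flat(coeffs, levels):
    """Inverse of dwt_flat."""
    out = list(coeffs)
    n = len(out) >> levels  # length of the coarsest approximation
    for _ in range(levels):
        out[:2 * n] = d4_inverse(out[:n], out[n:2 * n])
        n *= 2
    return out


def truncate_smallest(coeffs, fraction):
    """Zero the given fraction of coefficients with the smallest magnitude."""
    n_drop = int(len(coeffs) * fraction)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    out = list(coeffs)
    for i in order[:n_drop]:
        out[i] = 0.0
    return out


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for a signal with the given peak value."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Synthetic 8-bit-range test signal (an assumption; the paper uses a real image).
signal = [128 + 90 * math.sin(2 * math.pi * i / 64) + 20 * math.sin(2 * math.pi * i / 7)
          for i in range(256)]
LEVELS = 4
for fraction in (0.5, 0.9):
    kept = truncate_smallest(dwt_flat(signal, LEVELS), fraction)
    restored = idwt_flat(kept, LEVELS)
    print(f"truncated {fraction:.0%}: PSNR = {psnr(signal, restored):.2f} dB")
```

Because the periodic D4 transform is orthogonal, the reconstruction error energy equals the energy of the discarded coefficients, which is why dropping the smallest-magnitude coefficients first is the natural choice here. The PSNR values this sketch prints depend on the synthetic signal and are not comparable to the figures reported in the abstract.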