New Speech Compression Technique based on Filter Bank Design and Psychoacoustic Model

IF 0.9 · Zone 4 (Engineering and Technology) · JCR Q4 (Acoustics)

International Journal of Acoustics and Vibration · Pub Date: 2019-12-31 · DOI: 10.20855/ijav.2019.24.41455

M. Talbi, M. Bouhlel

Volume/issue: 24 1 · Pages: 728-735 · Publication type: Journal Article

Citations: 0


New Speech Compression Technique based on Filter Bank Design and Psychoacoustic Model

This paper proposes a new speech compression technique that combines a psychoacoustic model with a general, optimization-based approach to filter bank design. The technique is compared with a second compression technique that uses a psychoacoustic model and an MDCT (Modified Discrete Cosine Transform) filter bank of 32 filters. The comparison is based on the number of bits before and after compression, PSNR (Peak Signal-to-Noise Ratio), NRMSE (Normalized Root Mean Square Error), SNR (Signal-to-Noise Ratio), and PESQ (Perceptual Evaluation of Speech Quality). Both techniques are applied to a number of speech signals sampled at 8 kHz. The results show that the proposed technique outperforms the MDCT-based technique in terms of bits after compression and compression ratio: it yields higher compression ratios. Moreover, the proposed technique produces reconstructed speech signals of acceptable perceptual quality, as indicated by the SNR, PSNR, NRMSE, and PESQ values.
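The abstract names its evaluation measures but does not define them. As a rough illustration only, the sketch below computes the compression ratio, SNR, PSNR, and NRMSE for a signal and its reconstruction. The function names are hypothetical, and the formulas are common textbook conventions assumed here (peak taken as the largest absolute sample of the original, NRMSE normalized by the mean-removed signal energy); the paper may use different definitions. PESQ is omitted because it requires a dedicated implementation of the ITU-T procedure.

```python
import math


def compression_ratio(bits_before, bits_after):
    """Ratio of the bit count before compression to the bit count after."""
    return bits_before / bits_after


def snr_db(original, reconstructed):
    """SNR in dB: signal energy divided by reconstruction-error energy.

    Assumes the reconstruction is not identical to the original
    (otherwise the error energy is zero and the ratio is undefined).
    """
    signal_energy = sum(x * x for x in original)
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return 10.0 * math.log10(signal_energy / error_energy)


def psnr_db(original, reconstructed):
    """PSNR in dB. Assumption: peak = max |x| over the original signal."""
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    peak = max(abs(x) for x in original)
    return 10.0 * math.log10(peak * peak / mse)


def nrmse(original, reconstructed):
    """NRMSE. Assumption: normalized by the mean-removed signal energy."""
    mean = sum(original) / len(original)
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    centered_energy = sum((x - mean) ** 2 for x in original)
    return math.sqrt(error_energy / centered_energy)


if __name__ == "__main__":
    # Toy data, not from the paper: a 4-sample signal reconstructed at 90% amplitude.
    x = [1.0, -1.0, 1.0, -1.0]
    y = [0.9, -0.9, 0.9, -0.9]
    print(compression_ratio(128000, 32000))  # 4.0
    print(round(snr_db(x, y), 3))            # 20.0
    print(round(psnr_db(x, y), 3))           # 20.0
    print(round(nrmse(x, y), 3))             # 0.1
```

Under these conventions a higher compression ratio, SNR, and PSNR and a lower NRMSE are better, which matches the direction of the comparison described in the abstract.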


Source journal: International Journal of Acoustics and Vibration (Acoustics; Engineering, Mechanical)

CiteScore: 1.60

Self-citation rate: 10.00%

Review time: 12 months

About the journal: The International Journal of Acoustics and Vibration (IJAV) is the refereed open-access journal of the International Institute of Acoustics and Vibration (IIAV). The IIAV is a non-profit international scientific society founded in 1995. Its primary objective is to advance the science of acoustics and vibration by creating an international organization that is responsive to the needs of scientists and engineers concerned with acoustics and vibration problems around the world.

Manuscripts of articles, technical notes, and letters to the editor should be submitted to the Editor-in-Chief via the online submission system. Authors must first log in on the IJAV website. Logged-in users can submit new articles, track the status of submitted articles, upload revised articles, responses or rebuttals to reviewers, figures, biographies, photographs, and copyright transfer agreements, and send comments to the editor. Authors are notified automatically by email each time the status of a submitted article changes.

IIAV members (in good standing for at least six months) can publish in IJAV free of charge, and their papers are displayed online immediately after editing and layout. Non-IIAV members must pay a mandatory Article Processing Charge (APC) of $200 USD if the manuscript is accepted for publication after review; papers are not published if the APC is not paid. The APC allows IIAV to make the research freely available to all readers under the Open Access model. Non-IIAV members who also pay an extra voluntary publication fee (EVPF) of $500 USD are granted expedited publication, and their papers can be displayed online after acceptance. Authors who do not pay the voluntary $500 USD fee will still have their papers published, but possibly after a considerable delay.

The English text of papers must be of high quality; manuscripts with low-quality text will most likely be rejected. Authors whose first language is not English are advised to have their manuscripts reviewed and edited before submission by a native English speaker with scientific expertise. Many commercial editing services offer this at a cost to the authors.