Bandwidth Extension of Speech Using Perceptual Criteria

Pub Date: 2013-11-01. DOI: 10.2200/S00535ED1V01Y201309ASE013

Visar Berisha, Steven Sandoval, J. Liss


Citations: 2

Abstract

Bandwidth extension of speech is used in the International Telecommunication Union (ITU) G.729.1 standard, in which the narrowband bitstream is combined with quantized high-band parameters. Although this system produces high-quality wideband speech, the number of additional bits used to represent the high band can be reduced further. Besides the algorithm used in G.729.1, bandwidth extension methods based on spectrum prediction have also been proposed. These methods require no additional bits, but they perform poorly when the correlation between the low band and the high band is weak. In this dissertation, two wideband speech coding algorithms that rely on bandwidth extension are developed. The algorithms operate as wrappers around existing narrowband compression schemes: the low band is encoded with an existing toll-quality narrowband system, whereas the high band is generated with the proposed extension techniques. The first algorithm relies only on transmitted high-band information to generate the wideband speech. The second algorithm uses a constrained minimum mean square error (MMSE) estimator that combines transmitted high-band envelope information with a predictive scheme driven by narrowband features. Both algorithms use novel loudness-based perceptual models to determine optimum quantization strategies for wideband recovery and synthesis. Objective and subjective evaluations show that the proposed system operates at a lower average bit rate while improving speech quality compared with other similar algorithms.
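To make the idea behind the second algorithm more concrete, the following is a minimal sketch of one way a prediction from narrowband features could be combined with a transmitted high-band quantity. It is not the authors' estimator: the abstract gives no formulas, so the linear predictor, the choice of a single transmitted mean log-energy as the constraint, and the names `fit_lmmse` and `constrained_estimate` are all assumptions made for illustration.

```python
import numpy as np


def fit_lmmse(X, Y, ridge=1e-8):
    """Fit a linear MMSE predictor y ~ W @ x + b from training pairs.

    X: (n, dx) narrowband feature vectors (hypothetical features).
    Y: (n, dy) high-band envelope vectors, e.g. log band energies (assumed).
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    Cxx = Xc.T @ Xc / len(X)   # feature covariance
    Cxy = Xc.T @ Yc / len(X)   # feature/envelope cross-covariance
    # A small ridge term keeps the solve well conditioned (an assumption).
    W = np.linalg.solve(Cxx + ridge * np.eye(Cxx.shape[0]), Cxy).T
    b = my - W @ mx
    return W, b


def constrained_estimate(W, b, x, transmitted_mean):
    """Predict the envelope, then enforce the transmitted constraint.

    The prediction is shifted so that its mean equals the transmitted
    value. This is the Euclidean projection onto the constraint
    mean(y) == transmitted_mean, i.e. the closest vector in the
    least-squares sense that satisfies it.
    """
    y_pred = W @ x + b
    return y_pred + (transmitted_mean - y_pred.mean())
```

In this sketch the decoder would call `constrained_estimate` once per frame with the narrowband features it already has and the one side-information value it received. When the low and high bands are weakly correlated, the prediction contributes little and the transmitted value dominates, which is the failure mode of purely predictive schemes that the abstract points out.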
