New algorithms for sinusoidal speech coding at low bit rates

1997 IEEE International Conference on Personal Wireless Communications (Cat. No.97TH8338) | Pub Date: 1997-12-17 | DOI: 10.1109/ICPWC.1997.655478

S. Ahmadi, A. Spanias

Citations: 0

Abstract

This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms have been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually-motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.
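The abstract names two building blocks that a short sketch can make concrete: the sinusoidal model itself and split vector quantization of the sinusoidal amplitudes. The Python below is only an illustration under stated assumptions, not the authors' algorithms. It assumes a purely harmonic frame (sinusoids at integer multiples of the pitch frequency), an 8 kHz sampling rate, and a plain squared-error codebook search; the paper's voicing-dependent perceptual weighting, phase matching, and mid-frame interpolation are not reproduced. The function names `synthesize_frame` and `split_vq_encode` and the toy codebooks are hypothetical.

```python
import math


def synthesize_frame(f0_hz, amplitudes, phases, n_samples, sample_rate=8000):
    """Harmonic sinusoidal synthesis (generic textbook form, assumed here):
    s[n] = sum_k A_k * cos(2*pi*k*f0*n/fs + phi_k), for k = 1..K.
    """
    frame = []
    for n in range(n_samples):
        t = n / sample_rate
        sample = sum(
            a * math.cos(2.0 * math.pi * k * f0_hz * t + phi)
            for k, (a, phi) in enumerate(zip(amplitudes, phases), start=1)
        )
        frame.append(sample)
    return frame


def split_vq_encode(vector, codebooks):
    """Split `vector` into consecutive sub-vectors, one per codebook, and
    return the index of the nearest codeword in each codebook.

    Distance is unweighted squared error; a perceptual scheme would
    replace it with a weighted measure (not specified in the abstract).
    """
    indices = []
    start = 0
    for codebook in codebooks:
        dim = len(codebook[0])  # each codebook covers `dim` components
        sub = vector[start:start + dim]
        best = min(
            range(len(codebook)),
            key=lambda i: sum((x - c) ** 2 for x, c in zip(sub, codebook[i])),
        )
        indices.append(best)
        start += dim
    return indices


# Toy usage: one harmonic at 2 kHz, and a 3-dimensional amplitude vector
# split into a 2-dimensional and a 1-dimensional sub-vector.
frame = synthesize_frame(2000.0, [1.0], [0.0], n_samples=4)
toy_codebooks = [
    [[0.0, 0.0], [1.0, 0.0]],  # hypothetical codebook for components 0..1
    [[4.0], [6.0], [5.2]],     # hypothetical codebook for component 2
]
indices = split_vq_encode([0.9, 0.1, 5.0], toy_codebooks)
```

Splitting the amplitude vector keeps each codebook small, so the search cost and storage grow with the sum of the codebook sizes rather than their product, which is the usual motivation for split vector quantization at low bit rates.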
