A variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction

Ye Li, Qiuyun Hao, P. Zhang, Jingsai Jiang, Xiaofeng Ma, Yanhong Fan, H. V. Davydau
Published in: 2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)
Publication date: 2016-10-01
DOI: 10.1109/CISP-BMEI.2016.7852841 (https://doi.org/10.1109/CISP-BMEI.2016.7852841)
Citations: 7

Abstract

To improve the channel bandwidth utilization of voice communication, this paper proposes a variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction (MELPe). In voice communication, talking occupies only about 40% of the time; the rest is silence or background noise. In addition, in low-bit-rate speech coding algorithms an unvoiced frame usually requires a lower transmission rate than a voiced one. Always coding at the same bit rate therefore wastes channel resources. In the proposed algorithm, voice activity detection (VAD) divides the input signal into speech and silence, and speech frames are further classified as voiced or unvoiced. Each of the three frame types is coded and transmitted at a different rate. For a voiced frame, all parameters are encoded, transmitted, and decoded. For an unvoiced frame, only the gain parameters, LSF parameters, pitch parameters, and overall voicing are encoded, transmitted, and decoded. For a silence frame, only the gain parameters and the first-level LSF parameters are encoded, transmitted, and decoded. When talking occupies about 40% of the time, the average coding rate of the proposed variable-bit-rate vocoder can reach 1.33 kbps, compared with 2.4 kbps for the traditional MELPe vocoder, while achieving the same synthetic speech quality. Experimental results show that the proposed method reduces the average coding rate and that the synthesized background noise is subjectively comfortable to listen to.
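The frame classification and the average-rate arithmetic described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function and variable names are hypothetical, the voiced-only parameter names are an assumption based on the usual MELP parameter set (the abstract only says "all of the parameters"), and the abstract does not give per-frame bit allocations for unvoiced or silence frames, so any such values must be supplied by the caller. The only fixed number used is the standard 2.4 kbps MELPe frame of 54 bits per 22.5 ms.

```python
from enum import Enum

# Standard 2.4 kbps MELPe framing: 54 bits every 22.5 ms.
FRAME_MS = 22.5
VOICED_BITS = 54


class FrameType(Enum):
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    SILENCE = "silence"


# Parameters transmitted for each frame type, following the abstract.
# The last three voiced entries are assumed from the usual MELP parameter
# set; the abstract does not enumerate them.
TRANSMITTED = {
    FrameType.VOICED: (
        "gain", "lsf", "pitch", "overall_voicing",
        "bandpass_voicing", "fourier_magnitudes", "aperiodic_flag",
    ),
    FrameType.UNVOICED: ("gain", "lsf", "pitch", "overall_voicing"),
    FrameType.SILENCE: ("gain", "lsf_first_level"),
}


def classify(vad_is_speech: bool, is_voiced: bool) -> FrameType:
    """Map the VAD decision and the voicing decision to a frame type."""
    if not vad_is_speech:
        return FrameType.SILENCE
    return FrameType.VOICED if is_voiced else FrameType.UNVOICED


def average_rate_kbps(fractions, bits_per_frame, frame_ms=FRAME_MS):
    """Average rate = sum(fraction * bits) / frame length.

    fractions: share of frames per FrameType, must sum to 1.
    bits_per_frame: bits spent on one frame of each FrameType.
    Bits per millisecond is numerically equal to kbps.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("frame-type fractions must sum to 1")
    avg_bits = sum(fractions[t] * bits_per_frame[t] for t in fractions)
    return avg_bits / frame_ms
```

As a sanity check, a stream made up entirely of voiced frames gives `average_rate_kbps({FrameType.VOICED: 1.0}, {FrameType.VOICED: VOICED_BITS})`, which is 54 / 22.5 = 2.4 kbps, the fixed-rate MELPe baseline. Any mix that includes cheaper unvoiced and silence frames lowers the average, which is the effect the abstract reports.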