1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351): Latest Articles

Robust voice activity detection for DTX operation of speech coders
F. Basbug, S. Nandkumar, K. Swaminathan
Abstract: Robust detection of voice activity for short-term speech frames is essential for the discontinuous transmission (DTX) mode of operation of vocoders such as IS-641. A reference VAD for the IS-641 coder has been chosen for this purpose and is based on the GSM-EFR (enhanced full rate) VAD. We show by developing a comprehensive evaluation procedure that the reference VAD is sensitive to speech level variations. For example, a significant increase is seen in frames falsely classified as active at speech levels of 10 dB above or below nominal level. We propose a solution based on automatic gain control to reduce level sensitivity. Objective performance measures confirm the robustness of our proposed VAD.
DOI: https://doi.org/10.1109/SCFT.1999.781483
Published: 1999-06-20
Citations: 6
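
The level sensitivity described in the abstract above can be illustrated with a toy example. The sketch below is not the IS-641 reference VAD or the authors' AGC-based method; it is a minimal frame-energy detector under assumed values (160-sample frames, a -40 dB fixed threshold, a 15 dB margin), with hypothetical function names. It shows how a fixed threshold starts flagging quiet frames as active when the input is raised by 10 dB, while a detector that measures energy relative to the signal's own peak level (a crude offline stand-in for gain control) gives the same decisions at both levels.

```python
import math


def frame_energies_db(samples, frame_len=160):
    """Return the energy in dB of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small floor avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def vad_fixed(energies_db, threshold_db=-40.0):
    """Mark a frame active when its energy exceeds an absolute threshold."""
    return [e > threshold_db for e in energies_db]


def vad_level_normalized(energies_db, margin_db=15.0):
    """Mark a frame active when it is within margin_db of the loudest frame.

    Using the peak frame as the level reference is an assumption made for
    illustration; a real AGC would track the level adaptively over time.
    """
    peak = max(energies_db)
    return [e > peak - margin_db for e in energies_db]


def tone(amplitude, n):
    """Generate n samples of a sine with 10 full cycles per 160 samples."""
    return [amplitude * math.sin(2 * math.pi * 10 * i / 160) for i in range(n)]


# Quiet frame, loud frame, quiet frame, then the same signal raised by 10 dB.
nominal = tone(0.005, 160) + tone(0.1, 160) + tone(0.005, 160)
louder = [x * 10 ** (10 / 20) for x in nominal]

print(vad_fixed(frame_energies_db(nominal)))
print(vad_fixed(frame_energies_db(louder)))
print(vad_level_normalized(frame_energies_db(nominal)))
print(vad_level_normalized(frame_energies_db(louder)))
```

The fixed-threshold decisions differ between the two input levels, while the level-normalized decisions do not, which is the kind of robustness the abstract attributes to gain control.
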
LSP quantization in wideband speech coders
M. Ferhaoui, S. Van Gerven
Abstract: This paper deals with multi-stage vector quantization of line spectrum pair (LSP) parameters in wideband speech coders and discusses commonly used spectral distortion measures and their relation to the perceptual quality of the speech coding.
DOI: https://doi.org/10.1109/SCFT.1999.781472
Published: 1999-06-20
Citations: 14
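
For context on the "spectral distortion measures" mentioned in the abstract above, the most widely used one is the RMS log-spectral distortion between the original and quantized LPC power spectra. The abstract does not say which variants the paper examines, so the form below is the standard textbook definition, not necessarily the authors' choice. The symbols $S(\omega)$ (unquantized power spectrum) and $\hat{S}(\omega)$ (spectrum after LSP quantization) are notation assumed here.

```latex
\mathrm{SD} \;=\; \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi}
  \left[\,10\log_{10} S(\omega) \;-\; 10\log_{10}\hat{S}(\omega)\,\right]^{2}
  \mathrm{d}\omega}\quad\text{[dB]}
```

The measure is computed per frame and is usually reported as an average over all frames, together with the percentage of outlier frames.
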
Quantization of SEW and REW components for 3.6 kbit/s coding based on PWI
U. Bhaskar, S. Nandkumar, K. Swaminathan, G. Zakaria
Abstract: The design of a prototype waveform interpolation (PWI) based codec, operating at 3.6 kbit/s, is presented with the main focus on the quantization of the slowly evolving waveform (SEW) and rapidly evolving waveform (REW) components. The SEW magnitude component is quantized using a hierarchical mean-shape-gain predictive vector quantization approach. The SEW phase is derived using a phase model, based on a measure of voice periodicity. The REW magnitude is quantized using a gain and a sub-band based shape. The REW phase is obtained by high-pass filtering a weighted combination of the SEW and a white noise process.
DOI: https://doi.org/10.1109/SCFT.1999.781497
Published: 1999-06-20
Citations: 3
Performance of current perceptual objective speech quality measures
L. Thorpe, Wonho Yang
Abstract: This paper describes the performance of current objective speech quality measures designed to estimate subjective quality. We examined perceptual objective quality measures using a wide range of distortions including speech compression, wireless channel impairments, VoIP channel impairments, and modifications to the signal from features such as AGC. The results of this study indicate the range of conditions to which these objective measures may be applied, the validity of the estimates they provide, and the general maturity of the field.
DOI: https://doi.org/10.1109/SCFT.1999.781512
Published: 1999-06-20
Citations: 64
The adaptive multi-rate speech coder
E. Ekudden, R. Hagen, I. Johansson, J. Svedberg
Abstract: In this paper, we describe the adaptive multi-rate (AMR) speech coder currently under standardization for GSM systems as part of the AMR speech service. The coder is a multi-rate ACELP coder with 8 modes operating at bit rates from 12.2 kbit/s down to 4.75 kbit/s. The coder modes are integrated in a common structure where the bit-rate scalability is realized mainly by altering the quantization schemes for the different parameters. The coder provides seamless switching on 20 ms frame boundaries. The quality when used on GSM channels is significantly higher than for existing services.
DOI: https://doi.org/10.1109/SCFT.1999.781503
Published: 1999-06-20
Citations: 65
Advances in objective estimation of perceived speech quality
S. Voran
Abstract: We present two techniques that can be used to enhance objective estimators of perceived speech quality. Frame normalization and frame-energy plane partitioning are described and applied to a log-spectral-error-based estimator. The resulting estimators are compared with each other and with two established estimators. This is done through correlation with MOS values from 17 formal subjective tests. We find that the proposed techniques significantly improve the log-spectral-error-based estimator.
DOI: https://doi.org/10.1109/SCFT.1999.781510
Published: 1999-06-01
Citations: 42
Study and subjective evaluation on MPEG-4 narrowband CELP coding under mobile communication conditions
K. Ozawa, T. Nomura, M. Serizawa, H. Ehara, K. Yoshida, N. Tana
Abstract: This paper evaluates MPEG-4 narrowband (NB) CELP speech coding under various mobile communication conditions, including clean speech, background noise, and transmission errors. To make the codec robust against errors with a minimum increase in redundant bits, a CRC error correction code is added to the codec and error concealment is included in the decoder. Subjective evaluation results demonstrate that the speech quality for MPEG-4 speech coding above 8.3 kb/s is higher than that for ITU-T G.726 ADPCM at 32 kb/s in the clean speech condition. Further, the speech quality degradation is less than 0.1 MOS under 10^-3 bit error conditions, and the quality is still comparable to or higher than that for G.726 at 32 kb/s without errors.
DOI: https://doi.org/10.1109/SCFT.1999.781507
Published: 1999-03-08
Citations: 0
MVDR based all-pole modeling: properties, enhancements, and comparisons
M. Murthi, B. Rao
Abstract: In this paper, we present several features of minimum variance distortionless response (MVDR) based all-pole filters which are suitable for modeling all types of speech. In particular, we demonstrate how the MVDR all-pole spectrum, based upon time-domain correlations, can provide high quality spectral envelope modeling of voiced speech. Simulation results are included showing that the MVDR all-pole spectrum's modeling of voiced speech harmonics improves as the model order increases, leading to a monotonically decreasing spectral distortion. Furthermore, we show how the MVDR all-pole envelope can be enhanced by using forward-backward linear prediction. In addition, low order (10-14) MVDR based all-pole filters are examined and compared with other all-pole spectral envelopes. The reduced order MVDR all-pole spectrum is shown to compare favorably with linear prediction (LP) and LP cubic spline spectral envelopes in terms of spectral modeling and complexity.
DOI: https://doi.org/10.1109/SCFT.1999.781474
Citations: 3
Embedded WI coding between 2.0 and 4.8 kbit/s
Hong-Goo Kang, D. Sen
Abstract: This paper describes an embedded speech coder based on waveform interpolation (WI) techniques. Since the quantization of line spectral frequency (LSF) parameters is fairly orthogonal to the quantization of excitation information, designing an embedded system with WI is much easier than with other approaches. By using a hierarchical bit allocation for the excitation signals, which consist of a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW), the proposed system works well at bit rates of 2.0, 2.4, 3.0, 4.0 and 4.8 kbit/s. Listening tests indicate that the performance of the new system is comparable to an optimized fixed-rate WI coder, and the quality degrades gracefully as the bit rate decreases.
DOI: https://doi.org/10.1109/SCFT.1999.781493
Citations: 4