1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351): Latest Articles

New speech enhancement techniques for low bit rate speech coding
R. Martin, R. Cox
DOI: https://doi.org/10.1109/SCFT.1999.781519
Published: 1999-06-20
Abstract: In this paper we present novel solutions for pre-processing noisy speech prior to low bit rate speech coding. We strive especially to improve the estimation of spectral parameters and to reduce the additional algorithmic delay caused by the enhancement pre-processor. While the former is achieved using a new adaptive limiting algorithm for the a priori signal-to-noise ratio (SNR) estimate, the latter makes use of a novel overlap/add scheme. Our enhancement techniques were evaluated in conjunction with the 2400 bps mixed excitation linear prediction (MELP) coder by means of formal and informal listening tests.
Citations: 56
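The abstract above mentions limiting the a priori SNR estimate but does not give the adaptive rule. As a rough illustration only, the following sketch uses the widely known decision-directed estimator with a fixed lower limit as a stand-in; the function name, the smoothing factor `alpha`, the floor `xi_min`, and the Wiener-style gain are all assumptions and are not taken from the paper.

```python
import numpy as np

def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=0.1):
    """Decision-directed a priori SNR with a fixed lower limit (illustrative).

    noisy_power: array (frames, bins) of noisy-speech periodogram values.
    noise_power: array (bins,) of estimated noise power per bin.
    Returns the limited a priori SNR and a Wiener-style gain per frame and bin.
    """
    n_frames, n_bins = noisy_power.shape
    xi = np.empty((n_frames, n_bins), dtype=float)
    gain = np.empty((n_frames, n_bins), dtype=float)
    prev_clean_power = np.zeros(n_bins)  # clean-speech power estimate of previous frame
    for t in range(n_frames):
        gamma = noisy_power[t] / noise_power           # a posteriori SNR
        xi_t = (alpha * prev_clean_power / noise_power
                + (1.0 - alpha) * np.maximum(gamma - 1.0, 0.0))
        xi_t = np.maximum(xi_t, xi_min)                # lower limit on the estimate
        g = xi_t / (1.0 + xi_t)                        # Wiener-style gain (assumed)
        prev_clean_power = (g ** 2) * noisy_power[t]
        xi[t], gain[t] = xi_t, g
    return xi, gain
```

With a fixed floor, the gain never drops below `xi_min / (1 + xi_min)`, which bounds the maximum attenuation; an adaptive scheme would vary that floor over time.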
Trellis code excited linear prediction (TCELP) speech coding
Cheng-Chieh Lee, Y. Shoham
DOI: https://doi.org/10.1109/SCFT.1999.781500
Published: 1999-06-20
Abstract: This paper describes using the trellis-based scalar-vector quantizer for sources with memory to solve the excitation codebook search problem of code excited linear prediction (CELP) speech coders. This approach leads to a 24 kbit/s telephony-bandwidth low-delay (3 msec) trellis CELP coder, which outperforms both ITU-T 15 kbit/s G.728 LD-CELP and G.726 32 kbit/s ADPCM. Since the codebook is derived from a scalar alphabet, the proposed coder can effectively handle excitation vectors in the 24-dimensional space (to realize considerable vector quantization gains) and has a computational complexity of approximately 75% of that of ITU-T G.728 LD-CELP.
Citations: 1
Perceptual zerotrees for scalable wavelet coding of wideband audio
A. Aggarwal, V. Cuperman, K. Rose, A. Gersho
DOI: https://doi.org/10.1109/SCFT.1999.781469
Published: 1999-06-20
Abstract: This paper introduces a new algorithm for scalable coding of wideband audio signals. The technique is based on quantization of bi-orthogonal wavelet transformed coefficients using a perceptual zerotree method. An initial zerotree estimate of the wavelet coefficients is computed, followed by scalar quantization of the coefficients according to perceptual thresholds. The choice of wavelet decomposition and encoding parameters for each frame is adapted to the source characteristics employing a rate distortion criterion. The scalability of the coder is due to the tree structure, which enables graceful degradation with decrease in bit rate. Preliminary subjective tests indicate near-transparent quality for average bit rates in the range of 1.5 to 2.5 bits per sample.
Citations: 8
A wideband speech and audio codec at 16/24/32 kbit/s using hybrid ACELP/TCX techniques
B. Bessette, R. Salami, C. Laflamme, R. Lefebvre
DOI: https://doi.org/10.1109/SCFT.1999.781466
Published: 1999-06-20
Abstract: A hybrid ACELP/TCX algorithm for coding speech and music signals at 16, 24, and 32 kbit/s is presented. The algorithm switches between algebraic code excited linear prediction (ACELP) and transform coded excitation (TCX) modes on a 20-ms frame basis. Applying TCX on 20 ms frames improved the quality for music signals. Special care was taken to alleviate the switching artifacts between the two modes resulting in a transparent switching process. Subjective test results showed that for speech signals, the performance at 16, 24, and 32 kbit/s, is equivalent to G.722 at 48, 56, and 64 kbit/s, respectively. For music signals, the quality at 24 kbit/s was found equivalent to G.722 at 56 kbit/s. However, at 16 kbit/s, the quality for music was slightly lower than G.722 at 48 kbit/s.
Citations: 29
Recovery of speech spectral parameters using convex set projection
U. Visitkitjakarn, W. Chan, Yongyi Yang
DOI: https://doi.org/10.1109/SCFT.1999.781475
Published: 1999-06-20
Abstract: Previous works have demonstrated that by preserving speech spectral "dynamics" during spectral parameter quantization and/or decoding, the quality of coded speech can be improved. We explore the use of projections onto convex sets (POCS) techniques to recover speech spectral parameters from their quantized versions. Unlike prior works, the POCS approach enables us to obtain solutions that satisfy precise constraints. Two constraint sets are used in our POCS recovery algorithm: one set constrains the "roughness" of the parameter trajectories, and the other set confines the parameters to the proper quantizer partition cells. Simulation of our algorithm has consistently produced improvements in both the subjective quality and objective distortion measurements.
Citations: 2
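The two constraint sets named in the abstract above can be illustrated with a generic alternating-projection loop. This is a minimal sketch under stated assumptions, not the authors' algorithm: the quantizer is taken to be a uniform scalar quantizer (so each cell is an interval), roughness is measured as the Euclidean norm of first differences along one parameter trajectory, and the bound `rho` is a hypothetical tuning parameter. In the demo, `rho` is set from the unquantized trajectory purely so that the two sets are guaranteed to intersect.

```python
import numpy as np

def project_cells(x, lo, hi):
    """Project onto the quantizer cells (per-sample intervals)."""
    return np.clip(x, lo, hi)

def project_roughness(y, D, rho, iters=60):
    """Project onto {x : ||D x||_2 <= rho} via bisection on the multiplier."""
    if np.linalg.norm(D @ y) <= rho:
        return y
    A = D.T @ D
    I = np.eye(len(y))
    solve = lambda lam: np.linalg.solve(I + lam * A, y)  # KKT solution for a given multiplier
    lam_lo, lam_hi = 0.0, 1.0
    while np.linalg.norm(D @ solve(lam_hi)) > rho:        # bracket the multiplier
        lam_hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lam_lo + lam_hi)
        if np.linalg.norm(D @ solve(mid)) > rho:
            lam_lo = mid
        else:
            lam_hi = mid
    return solve(lam_hi)

def pocs_recover(q, lo, hi, D, rho, n_iter=50):
    """Alternate the two projections, ending on the cell constraint."""
    x = q.copy()
    for _ in range(n_iter):
        x = project_roughness(x, D, rho)
        x = project_cells(x, lo, hi)
    return x

def demo(n=40, step=0.25):
    t = np.arange(n)
    s = np.sin(2 * np.pi * t / n)             # smooth "true" trajectory (synthetic)
    q = step * np.round(s / step)             # uniform scalar quantization
    lo, hi = q - step / 2, q + step / 2       # quantizer cell of each sample
    D = np.diff(np.eye(n), axis=0)            # first-difference operator
    rho = np.linalg.norm(D @ s)               # assumption: roughness bound known
    x = pocs_recover(q, lo, hi, D, rho)
    return s, q, x, lo, hi
```

Because the loop ends with the cell projection, the recovered trajectory always decodes to the same quantizer indices as the input.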
Integration of speech enhancement and coding techniques
M. Kuropatwinski, D. Leckschat, K. Kroschel, A. Czyżewski
DOI: https://doi.org/10.1109/SCFT.1999.781520
Published: 1999-06-20
Abstract: Speech coding techniques commonly used in low bit rate analysis-by-synthesis linear predictive coders (LPAS coders) can serve as a speech signal model emphasizing its important features. In the paper it is shown how this coding method can be utilized for speech enhancement. Particularly, the speech signal is modeled as the output of a cascade of an adaptive formant filter and a pitch filter, driven by a white Gaussian process with variance changing with time. A signal estimation method based on the Kalman filter is investigated which implements this speech signal model. The proposed approach yields significantly better performance both in SNR and subjective impression than Kalman filter methods, which use only short-time speech parameters.
Citations: 5
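For orientation, the baseline the abstract above compares against (a Kalman filter driven only by short-time parameters) can be sketched as follows. This is an illustrative simplification: it models speech as a fixed autoregressive (AR) process observed in white noise, with known coefficients `a`, excitation variance `q`, and noise variance `r`. The pitch filter and time-varying variance of the proposed model are not included, and all names here are assumptions.

```python
import numpy as np

def kalman_ar_denoise(y, a, q, r):
    """Kalman filter for an AR(p) signal observed in white noise.

    y: noisy observations; a: AR coefficients [a1, ..., ap];
    q: excitation variance; r: observation-noise variance.
    """
    p = len(a)
    F = np.zeros((p, p))
    F[0, :] = a                       # companion-form state transition
    F[1:, :-1] = np.eye(p - 1)
    Q = np.zeros((p, p))
    Q[0, 0] = q                       # excitation enters the newest sample only
    x = np.zeros(p)
    P = np.eye(p) * (q + r)           # arbitrary initial uncertainty
    out = np.empty(len(y))
    for k, yk in enumerate(y):
        x = F @ x                     # predict
        P = F @ P @ F.T + Q
        K = P[:, 0] / (P[0, 0] + r)   # gain for observation matrix H = [1, 0, ..., 0]
        x = x + K * (yk - x[0])       # update with the innovation
        P = P - np.outer(K, P[0, :])
        out[k] = x[0]
    return out

def demo(n=4000, seed=1):
    rng = np.random.default_rng(seed)
    a, q, r = [1.6, -0.8], 1.0, 4.0   # stable AR(2), synthetic values
    s = np.zeros(n)
    for k in range(2, n):
        s[k] = a[0] * s[k - 1] + a[1] * s[k - 2] + rng.normal(scale=np.sqrt(q))
    y = s + rng.normal(scale=np.sqrt(r), size=n)
    s_hat = kalman_ar_denoise(y, a, q, r)
    return np.mean((y - s) ** 2), np.mean((s_hat - s) ** 2)
```

Extending the state vector to cover a pitch lag is one way a long-term (pitch) predictor could be folded into the same framework, at the cost of a much larger state dimension.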
Multi-rate wideband speech/channel codec based on MPEG-4/CELP for ETSI/GSM full-rate channel
A. Murashima, M. Serizawa, K. Ozawa
DOI: https://doi.org/10.1109/SCFT.1999.781470
Published: 1999-06-20
Abstract: This paper proposes a wideband multi-rate speech and channel codec based on the MPEG-4/CELP for the ETSI/GSM full-rate channel. In order to improve coding performance under mobile environments, such as channel error and background noise, the proposed codec operates at three bit allocations between speech and channel coding with a constant gross bit-rate of 22.8 kbit/s. The speech coding bit-rates are 10.9, 12.1 and 15.9 kbit/s. It achieves high speech quality under any channel condition by switching the bit allocations and also for noisy speech by using the highest bit-rate. The preliminary subjective evaluation tests show the speech quality is improved by switching the bit allocation under error conditions. It is also comparable or superior to ITU-T Recommendation G.722 48 kbit/s for carrier-to-interference ratios (C/I) higher than 10 dB. The codec at 15.9 kbit/s also gives comparable speech quality to G.722 at 48 kbit/s under background noise conditions.
Citations: 3
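The trade-off between speech and channel coding in the entry above follows directly from the fixed gross rate. The short calculation below derives the bits left for channel coding in each mode; only the 22.8 kbit/s gross rate and the three speech rates come from the abstract, and the function name is illustrative.

```python
GROSS_KBITS = 22.8                    # constant gross bit rate (from the abstract)
SPEECH_KBITS = (10.9, 12.1, 15.9)     # speech coding rates (from the abstract)

def channel_coding_rates(gross=GROSS_KBITS, speech_rates=SPEECH_KBITS):
    """Return the kbit/s left for channel coding in each mode."""
    return [round(gross - s, 1) for s in speech_rates]

for speech, channel in zip(SPEECH_KBITS, channel_coding_rates()):
    print(f"speech {speech} kbit/s -> channel coding {channel} kbit/s")
```

The lowest speech rate leaves 11.9 kbit/s for error protection, while the highest leaves only 6.9 kbit/s, which is consistent with using the highest speech rate when the channel is good and the main impairment is background noise.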
BEC++: a software tool for increased flexibility in algorithm development
M. Harton, K. Kapuscinski
DOI: https://doi.org/10.1109/SCFT.1999.781486
Published: 1999-06-20
Abstract: Sometimes, there is little interest by algorithm developers in creating a fixed-point simulation from a floating-point algorithm. However, often it is vital that high levels of speech quality be maintained in a fixed-point application. The process of converting floating-point simulations to fixed-point is time consuming, expensive, and if not done well, a state-of-the-art algorithm may never see product implementation. There is a critical need for software tools that reduce the time and effort that algorithm developers spend on floating-point to fixed-point software conversion. Bit-Exact C++ (BEC++) is just such a tool. This paper discusses a fixed-point software implementation tool, BEC++, with syntax similar in look and feel to that of floating-point C. Based on the ETSI Bit-Exact C (BEC) software now commonly used in industry, BEC++ extends the capabilities of BEC through the introduction of C++ language features and object-oriented techniques. This paper also details how to use the software, providing comparisons between BEC++ and BEC implementations.
Citations: 1
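The idea of giving fixed-point arithmetic a floating-point "look and feel" through operator overloading can be illustrated in a few lines. The sketch below is not the BEC++ API, which the abstract does not describe; it is a hypothetical Python class that wraps a 16-bit value and applies saturating addition, subtraction, and Q15 multiplication in the style of bit-exact basic operators.

```python
class Word16:
    """Hypothetical 16-bit saturating fixed-point value (Q15 for multiplication)."""
    MAX, MIN = 32767, -32768

    def __init__(self, value):
        self.v = self._saturate(int(value))

    @staticmethod
    def _saturate(value):
        # Clamp to the 16-bit signed range instead of wrapping around.
        return max(Word16.MIN, min(Word16.MAX, value))

    def __add__(self, other):
        return Word16(self.v + other.v)

    def __sub__(self, other):
        return Word16(self.v - other.v)

    def __mul__(self, other):
        # Q15 x Q15 product, shifted back to Q15, then saturated.
        return Word16((self.v * other.v) >> 15)

    def __eq__(self, other):
        return isinstance(other, Word16) and self.v == other.v

    def __repr__(self):
        return f"Word16({self.v})"

# Usage: expressions read like ordinary arithmetic but never overflow.
half = Word16(16384)                      # 0.5 in Q15
quarter = half * half                     # 0.25 in Q15
clipped = Word16(30000) + Word16(10000)   # saturates at the 16-bit maximum
```

With explicit function calls, the same expression would be written as nested calls for each operation; overloading keeps the algorithm text close to its floating-point form while the class enforces the fixed-point behavior.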
Multiple-description coding (MDC) of speech with an invertible auditory model
G. Kubin, W. Kleijn
DOI: https://doi.org/10.1109/SCFT.1999.781491
Published: 1999-06-20
Abstract: Network signal processing aspects dominate in speech and audio coding applications such as Internet telephony or packet radio networks. We demonstrate that our approach to speech coding in a perceptual domain provides an implicit forward error concealment mechanism to handle random erasures of the channel. To this end, the individual acoustic subchannels of our auditory model are grouped into different transport subchannels or packets. Due to the strongly overlapping, redundant filterbank structure of the model, reconstruction of speech without audible degradation becomes possible even if a significant percentage of channels is erased (e.g., up to 40% in a 50-channel auditory model for narrowband speech). We discuss this result both from a hearing-physiology and a frame-theoretic perspective.
Citations: 18
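The frame-theoretic point in the abstract above, that a redundant representation survives erasures, can be shown with a toy linear example. The sketch below is not the auditory model: it assumes a random 50-row analysis matrix acting on a 20-dimensional signal (the dimension 20 and the Gaussian matrix are arbitrary choices), erases 40% of the channels, and reconstructs by least squares from the surviving ones.

```python
import numpy as np

def reconstruct_after_erasure(n_channels=50, dim=20, erased_fraction=0.4, seed=0):
    """Erase a fraction of redundant channels and reconstruct by least squares."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_channels, dim))   # redundant analysis operator (toy)
    x = rng.standard_normal(dim)                 # signal to be represented
    y = A @ x                                    # one coefficient per channel
    n_keep = n_channels - int(round(erased_fraction * n_channels))
    keep = np.sort(rng.choice(n_channels, size=n_keep, replace=False))
    # Only the surviving channels are used for reconstruction.
    x_hat, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
    return x, x_hat

x, x_hat = reconstruct_after_erasure()
print(np.max(np.abs(x - x_hat)))   # reconstruction error of the toy example
```

Reconstruction stays exact up to numerical precision as long as the surviving rows still span the signal space, which here requires at least 20 of the 50 channels to arrive. In the perceptual setting of the paper the criterion is audible degradation rather than exact recovery.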
A low bit rate codec for AMR standard
M. Foodeei, H. Zarrinkoub, R. Matmti, R. Rabipour, F. Gabin, S. Gosne
DOI: https://doi.org/10.1109/SCFT.1999.781505
Published: 1999-06-20
Abstract: We describe a low bit rate speech codec based on the RCELP paradigm and designed as a candidate for GSM-AMR. The relaxation of the waveform-matching constraint in the RCELP model allows for reducing the bit rate without affecting the speech quality. New efficient quantization methods for the LSF and gain parameters coupled with some algorithmic improvements result in a high quality speech codec at bit rates as low as 4.55 kbit/s. Subjective tests show encouraging results in terms of quality and robustness under various operating conditions.
Citations: 2