1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351): Latest Articles

Speaker adaptation in a phonetic vocoding environment
C. Ribeiro, I. Trancoso
DOI: https://doi.org/10.1109/SCFT.1999.781485
Published: 1999-06-20
Abstract: The coder proposed in this paper falls in the class of segmental vocoders known as phonetic vocoders. Speaker recognisability is one of the main problems faced by vocoders at the lowest bit rates, given the need to reduce speaker specific information. Hence, phonetic vocoders are very suitable to speaker dependent coding, and can achieve bit rates as low as 250 bit/s. For speaker independent coding a speaker adaptation methodology is adopted, although resulting in higher bit rates to transmit the speaker specific information. In order to further reduce the corresponding bit rate, a new method is proposed that explores the intra-speaker correlation for the same phone.
Citations: 6
Coding the line spectral frequencies by jointly optimized MA prediction and vector quantization
Y. Shoham
DOI: https://doi.org/10.1109/SCFT.1999.781479
Published: 1999-06-20
Abstract: This paper presents a method for designing and optimizing predictive vector quantizers (PVQ) for coding the line spectral frequencies (LSF) in LPC-based speech and audio coders. The algorithm is based on iterative optimization of the predictors and the vector-quantizer codebooks. It is shown that the proposed method yields high quality LSF predictive quantizers with performance exceeding that of the PVQ used in the G.729 standard.
Citations: 2
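
As a rough illustration of the moving-average (MA) predictive vector quantization that the abstract above refers to, the sketch below predicts each frame from the previously selected codebook vectors and quantizes what is left. It does not reproduce the paper's joint optimization of predictors and codebooks; the function names, the first-order predictor, and the toy codebook are all assumptions chosen for illustration.

```python
def nearest(codebook, target):
    """Index of the codebook vector closest to target (squared error)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - t) ** 2 for c, t in zip(codebook[i], target)),
    )


def predict(mean, ma_coeffs, history):
    """MA prediction: mean plus a weighted sum of past quantized residuals."""
    return [
        mean[d] + sum(b * past[d] for b, past in zip(ma_coeffs, history))
        for d in range(len(mean))
    ]


def ma_pvq_encode(frames, mean, ma_coeffs, codebook):
    """Return one codebook index per input frame."""
    history = [[0.0] * len(mean) for _ in ma_coeffs]  # most recent first
    indices = []
    for x in frames:
        pred = predict(mean, ma_coeffs, history)
        residual = [xi - pi for xi, pi in zip(x, pred)]
        idx = nearest(codebook, residual)
        indices.append(idx)
        # The predictor memory holds quantized residuals only, so the
        # decoder can rebuild exactly the same state from the indices.
        history = [list(codebook[idx])] + history[:-1]
    return indices


def ma_pvq_decode(indices, mean, ma_coeffs, codebook):
    """Rebuild the frames from the transmitted indices."""
    history = [[0.0] * len(mean) for _ in ma_coeffs]
    frames = []
    for idx in indices:
        pred = predict(mean, ma_coeffs, history)
        frames.append([p + c for p, c in zip(pred, codebook[idx])])
        history = [list(codebook[idx])] + history[:-1]
    return frames


# Hypothetical toy setup: 2-dimensional frames, first-order MA predictor.
mean = [0.3, 0.6]
ma_coeffs = [0.5]
codebook = [[-0.1, -0.1], [0.0, 0.0], [0.1, 0.1]]
frames = [[0.4, 0.7], [0.45, 0.75], [0.27, 0.57]]

indices = ma_pvq_encode(frames, mean, ma_coeffs, codebook)
decoded = ma_pvq_decode(indices, mean, ma_coeffs, codebook)
```

Because the prediction depends only on a finite number of past codebook outputs, a corrupted index affects at most that many following frames, which is one common reason MA prediction is chosen over autoregressive prediction for LSF coding.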
Reverse water-filling in predictive encoding of speech
S. Andersen, W. Kleijn
DOI: https://doi.org/10.1109/SCFT.1999.781499
Published: 1999-06-20
Abstract: Reverse water-filling suggests that, at low bit rates, the synthesis filter for predictive encoding should differ from the model filter of the signal to be encoded. However, reverse water-filling follows from optimum encoding and stationary Gaussian assumptions. By means of simple experiments, we show that reverse water-filling applies to predictive encoding of speech. For a vector analysis-by-synthesis encoding based on a first order autoregressive signal model, the use of a synthesis filter derived from reverse water-filling resulted in consistently improved segmental SNR measures.
Citations: 10
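
For readers unfamiliar with the term used in the abstract above, the following is a minimal sketch of textbook reverse water-filling for independent Gaussian components under a squared-error criterion. It is not the authors' experiment: the function name, the bisection search, and the example variances are assumptions. A "water level" theta is chosen so that the per-component distortions min(theta, variance) add up to the allowed total; components whose variance lies below theta receive no bits.

```python
import math


def reverse_water_filling(variances, total_distortion):
    """Return (theta, per-component distortions, total rate in bits).

    Assumes independent Gaussian components and squared-error distortion.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")

    # Bisection on the water level: the summed distortion is
    # non-decreasing in theta, so a bracketing search converges.
    lo, hi = 0.0, max(variances)
    for _ in range(100):
        theta = 0.5 * (lo + hi)
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    theta = 0.5 * (lo + hi)

    distortions = [min(theta, v) for v in variances]
    # Components at or below the water level are not coded (zero rate).
    rate = sum(
        0.5 * math.log2(v / d) for v, d in zip(variances, distortions) if v > d
    )
    return theta, distortions, rate


# Hypothetical example: three components, total distortion budget of 1.5.
theta, distortions, rate = reverse_water_filling([4.0, 1.0, 0.25], 1.5)
```

In this example the weakest component (variance 0.25) falls below the water level and is left uncoded, which mirrors the abstract's point that at low rates the coder should not try to reproduce every part of the signal model.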
Adapting waveform interpolation (with pitch-spaced subbands) for quantisation
N. R. Chong, I. Burnett, J. Chicharo
DOI: https://doi.org/10.1109/SCFT.1999.781496
Published: 1999-06-20
Abstract: Adaptation of the waveform interpolation (WI) paradigm to allow waveform coding of the speech signals was reported by Kleijn et al. (see Proc. 5th Int. Conf. Spoken Language Processing, Sydney, Australia, Dec. 1998). However, since the signal is time-warped to a constant pitch, processing the surface derived from the new technique is extremely dependent on having an accurate pitch track. In order to facilitate vector quantisation techniques, it is necessary to manipulate the pitch track to ensure phase-alignment of critically sampled pitch periods. In addition, pitch cycles following unvoiced segments must also carry the same phase offset. The adjusted pitch track is used to facilitate a re-warping of the residual signal. The effects of warping and pitch inaccuracies on the transformed result of the warped periods are also discussed.
Citations: 9
Robust speech transmission over noisy channels employing non-linear block codes
S. Heinen, S. Bleck, P. Vary
DOI: https://doi.org/10.1109/SCFT.1999.781488
Published: 1999-06-20
Abstract: In medium to low bit rate speech codecs the speech signal is represented by a set of parameters. The most important concept is presently code excited linear predictive (CELP) coding. A speech segment of typically 10 to 20 ms is described in terms of prediction coefficients, gain factors and excitation vectors. Due to the high compression rates (0.5-1.5 bits per speech sample) the parameters are partly highly sensitive against channel noise. In this paper we present a new error protection technique, that is based on a joint optimization of parameter quantization and a redundant non-linear block coding scheme. For parameter reconstruction, the principle of soft bit source decoding is applied. The proposed technique can be used in combination with conventional error protection such as convolutional coding and allows a flexible subdivision of the gross data rate for source coding and error protection.
Citations: 6
Fast spherical code decoding algorithms for the residual codebook in CELP coders
K. Koppinen, T. Mikkonen
DOI: https://doi.org/10.1109/SCFT.1999.781501
Published: 1999-06-20
Abstract: A new method for fast decoding when using algebraic codes for the fixed codebook in CELP speech coders is presented. This method is based on the trellis structure of a block code, and allows fast optimal search of the residual codebook even with a combined scalar gain, unlike previous search methods. The method is flexible, allowing for long block lengths and the use of any code including nonlinear ones. Currently the performance is not as high as with standard algebraic coding methods, but further refinements may make this a viable method.
Citations: 1
An adaptive post-filtering technique based on a least squares approach
A. Mustapha, S. Yeldener
DOI: https://doi.org/10.1109/SCFT.1999.781516
Published: 1999-06-20
Abstract: This paper presents an adaptive time-domain post-filtering technique based on the least squares approach and modified Yule-Walker (MYW) filter. Conventionally, post-filtering is derived from an original LPC spectrum. In general, this time-domain technique produces unpredictable spectral tilt that is hard to control by the modified LPC synthesis, inverse and high pass filtering and hence introduces muffling in the speech quality. Other approaches of designing post-filters were developed in the frequency domain which can only be used in sinusoidal based speech coders. We have also developed a new time-domain post-filtering technique which eliminates the problem of spectral tilt in the speech spectrum and can be applied to various speech coders. The new post-filter has a flat frequency response at the formant peaks of the speech spectrum. This post-filtering technique has been used in a 4 kb/s harmonic excitation linear predictive coder (HE-LPC) and subjective listening tests have indicated that this technique outperforms the conventional one in both one and two tandem connections.
Citations: 3
Post noise smoother to improve low bit rate speech-coding performance
H. Tasaki, S. Takahashi
DOI: https://doi.org/10.1109/SCFT.1999.781517
Published: 1999-06-20
Abstract: A new post-process called a post noise smoother (PNS) for the CELP decoder is proposed in order to improve low bit rate speech-coding performance under various background noise conditions. In the PNS, spectral amplitude smoothing and phase randomizing are performed on the decoded speech in order to obtain smoothed background noise. The decoded speech, the smoothed signal, and an automatically generated imitative noise signal are multiplied by adaptive gains and are summed up in the final output speech. These gains are computed from each frame's estimated ratio of background noise to signal. Evaluation test results show that the PNS significantly improves the subjective quality of a 4-kbps speech coder under various conditions of background noise.
Citations: 5
A multimode transform predictive coder (MTPC) for speech and audio
S. Ramprashad
DOI: https://doi.org/10.1109/SCFT.1999.781467
Published: 1999-06-20
Abstract: Speech and audio coding are often considered to be two separate technologies, each almost independently developing different techniques for signal compression. At low bit rates the gap in performance between the two technologies begins to be noticeable; speech coders work better on speech and audio coders perform better on music. The challenge is to merge the two technologies into a single coding paradigm which will work as well as either of the two regardless of the input signal. Presented is a multimode speech and audio coder which can adapt almost continuously between a speech and audio coding mode. This multimode transform predictive coder (MTPC) shows improved performance on both speech and audio inputs when compared to a single-mode transform predictive coder (TPC).
Citations: 27
A computational model for MOS prediction
Doh-Suk Kim, O. Ghitza, P. Kroon
DOI: https://doi.org/10.1109/SCFT.1999.781511
Published: 1999-06-20
Abstract: A computational model to predict MOS (mean opinion score) of processed speech is proposed. The system measures the distortion of processed speech (compared to the source speech) using a peripheral model of the mammalian auditory system and a psychophysically-inspired measure, and maps the distortion value onto the MOS scale. This paper describes our attempt to derive a "universal", database-independent, distortion-to-MOS mapping function. Preliminary experimental evaluation shows that the performance of the proposed system is comparable with ITU-T recommendation P.861 for clean speech sources, and outperforms the P.861 recommendation for speech sources corrupted by either car or babble noise at 30 dB SNR.
Citations: 2