2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421): Latest Literature

Analysis-by-synthesis voicing cut-off determination in harmonic coding
Wenhui Jia, W. Chan
{"title":"Analysis-by-synthesis voicing cut-off determination in harmonic coding","authors":"Wenhui Jia, W. Chan","doi":"10.1109/SCFT.2000.878397","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878397","url":null,"abstract":"In low bit-rate harmonic speech coding, voicing information is often specified by a cut-off frequency of the spectrum. Many approaches of cut-off estimation depend on spectral matching, where a fixed prototype spectrum is used to model voiced harmonics. However, voiced harmonics do not always show a regular shape. One of the causes is harmonic interference. We propose an analysis-by-synthesis voicing cut-off determination scheme that takes into account harmonic interactions in spectral matching. The proposed scheme has been embedded in a 2.4 kb/s harmonic coder. Subjective listening tests show that the scheme performs well and is robust against noise.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129941456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Markov chain prediction for missing speech frame compensation
M. A. Kohler, R. Yarlagadda
{"title":"Markov chain prediction for missing speech frame compensation","authors":"M. A. Kohler, R. Yarlagadda","doi":"10.1109/SCFT.2000.878402","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878402","url":null,"abstract":"Transmitting voice over packet-switched networks, such as the Internet, is an appealing communication alternative to the traditional wireline system. The ability to lower the cost of long-distance telephone calls and provide additional capabilities is attracting customers worldwide to this tool. However, many current packet-switched protocols cannot guarantee real-time delivery of packets. When voice packets are lost, deleted, or excessively delayed in the network, the receiver must provide something for the listener to hear. This paper describes Markov chain prediction, a technique for compensating when speech frames are missing. It outperforms venerable frame repetition using both subjective and objective measurements.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127802999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Model based spectrum prediction
J. Lindblom, J. Samuelsson, Per Hedelin
{"title":"Model based spectrum prediction","authors":"J. Lindblom, J. Samuelsson, Per Hedelin","doi":"10.1109/SCFT.2000.878419","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878419","url":null,"abstract":"This paper presents methods for speech spectrum prediction based on Gaussian mixture models. Spectrum prediction may be useful in a packet transmission system where the sensitivity to packet losses is a major problem. Models of speech are trained by the expectation maximization algorithm using pairs, triples etc. of consecutive cepstral vectors. The models are used to design first, second etc. order predictors. The prediction schemes are evaluated using the spectral distortion criterion and compared to a simple reference method. The best prediction scheme obtains an average spectral distortion that is 0.46 dB less than for the reference method.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128328853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding
N. Chong-White, Ian Burnett
{"title":"Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding","authors":"N. Chong-White, Ian Burnett","doi":"10.1109/SCFT.2000.878394","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878394","url":null,"abstract":"This paper presents a waveform-matched waveform interpolation (WMWI) technique which enables improved speech analysis over existing WI coders. In WMWI, an accurate representation of speech evolution is produced by extracting critically-sampled pitch periods of a time-warped, constant pitch residual. The technique also offers waveform-matching capabilities by using an inverse warping process to near-perfectly reconstruct the residual. Here, a pitch track optimisation technique is described which ensures the speech residual can be effectively decomposed and quantised. Also, the pitch parameters required to efficiently quantise and recreate the pitch track, on a period-by-period basis, are identified. This allows time-synchrony between the original and decoded signals to be preserved.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134171063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Results on reverse water-filling, SNR, and log-spectral error in codebook-based coding
S. Voran
{"title":"Results on reverse water-filling, SNR, and log-spectral error in codebook-based coding","authors":"S. Voran","doi":"10.1109/SCFT.2000.878387","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878387","url":null,"abstract":"This paper identifies optimum levels of reverse water-filling for codebook-based coding of noise and speech signals. We find that there is little to be gained from optimizing an effective rate parameter. We identify trade-offs between SNR and log-spectral error. We show that the use of a gain factor compares favorably with reverse water-filling in some situations.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"445 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133246903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Changes in voice quality judgments as a function of background noise level in the listening environment
L. Thorpe, R. Rabipour
{"title":"Changes in voice quality judgments as a function of background noise level in the listening environment","authors":"L. Thorpe, R. Rabipour","doi":"10.1109/SCFT.2000.878382","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878382","url":null,"abstract":"This study explores the extent to which differences in voice quality with different bit rates become less perceptible when users are listening in a noisy environment. The individual rate modes of two multi-rate codecs were rated by listeners in various background noise conditions, including a quiet baseline, crowd babble, street noise, factory noise, and two levels of car noise. The results suggest that in some cases a lower bit-rate codec can be substituted without an associated drop in perceived quality when the listener is in a noisy location. Based on this effect, it would be possible to increase the system capacity or allow graceful handling of network overload by reducing transmission bandwidth allocated to receivers in high background noise without associated reduction in perceived voice quality.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116080017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
On quantizer dimensions in joint speech/channel coding
Tim Fingscheidt, T. Hindelang, V. Richard, Nambi Seshadri
{"title":"On quantizer dimensions in joint speech/channel coding","authors":"Tim Fingscheidt, T. Hindelang, V. Richard, Nambi Seshadri","doi":"10.1109/SCFT.2000.878404","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878404","url":null,"abstract":"In mobile speech communication usually vector quantization (VQ) is employed to ensure high coding efficiency. Convolutional coding and softbit speech decoding can add a considerable amount of robustness. VQ as well as softbit decoding however can be a quite complex task. Under the constraint of a constant gross bit rate and clean channel quality we propose the use of lower dimensional VQ or even scalar quantization (SQ) with a higher bit rate which leaves then fewer redundancy to be added by channel coding. This concept of joint speech/channel coding with its suboptimal speech coder and weaker channel coder can efficiently employ softbit speech decoding yielding a low overall complexity transmission scheme. Cases are shown where its performance is even better as compared to a high dimensional VQ with softbit decoding.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127002961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Application of multidimensional scaling to subjective evaluation of coded speech
J. L. Hall
{"title":"Application of multidimensional scaling to subjective evaluation of coded speech","authors":"J. L. Hall","doi":"10.1109/SCFT.2000.878380","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878380","url":null,"abstract":"We propose a new procedure for subjective evaluation of coded speech. This procedure has the potential of providing an anchorable measure of quality that contains more information than the single number provided by MOS testing. A stimulus space and the relationship between this space and speech quality are established with multidimensional scaling techniques in a large-scale listening test. In the field, the user uses a method described in this report to position a stimulus under evaluation in this previously-established space, and from this position the user draws conclusions about speech quality. The stimulus space is created by the multidimensional scaling program INDSCAL, which operates on subjective judgments of dissimilarities between samples of speech to create a stimulus space in which distances between stimuli correspond to perceptual dissimilarities. The stimulus space has the additional property that its dimensions correspond to perceptual attributes of the stimuli. In a pilot experiment, stimulus spaces for utterances produced by a male and a female talker were found to be highly correlated. MOS scores obtained in a separate study were found to be highly correlated with position in the stimulus space. We discuss both the physical and perceptual correlates of the three dimensions.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128499370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
4 kb/s improved multi-pulse based CELP speech coding with multiple location codebook and post-processing
K. Ozawa
{"title":"4 kb/s improved multi-pulse based CELP speech coding with multiple location codebook and post-processing","authors":"K. Ozawa","doi":"10.1109/SCFT.2000.878379","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878379","url":null,"abstract":"This paper proposes an improved MP-CELP (Multi-Pulse-based CELP) speech coding at 4 kb/s. In MP-CELP, amplitudes or signs of multi-pulse excitation are simultaneously vector quantized (VQ). In order to improve speech quality for voiced speech, a multiple pulse location codebook is stored to enhance the coverage of the location. The optimum combination among the pulse location codebook, pulse amplitude codevector and gain codevector is searched for and selected. In order to be robust against background noise, a post-processing efficiently reduces temporal fluctuation for the excitation signal. The subjective evaluation results show that speech quality for 4 kb/s improved MP-CELP is equivalent to that for ITU-T G.726 (32 kb/s) and G.729 (8 kb/s) for both M-IRS and flat clean speech. For background noise conditions, 4 kb/s speech quality is close to that for ITU-T G.726 (32 kb/s).","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131217499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Very low rate speech coding using temporal decomposition and waveform interpolation
C. Ritz, I. Burnett, J. Lukasiak
{"title":"Very low rate speech coding using temporal decomposition and waveform interpolation","authors":"C. Ritz, I. Burnett, J. Lukasiak","doi":"10.1109/SCFT.2000.878384","DOIUrl":"https://doi.org/10.1109/SCFT.2000.878384","url":null,"abstract":"In very low rate coding the aim is to accurately represent speech characteristics as efficiently as possible. High coding gains for the spectral features can be achieved through the use of temporal decomposition. Waveform interpolation coders accurately represent the excitation using characteristic waveforms (CWs) extracted at a constant rate. In this paper, the two approaches are combined into a very low rate coder operating at around 1 kbps. It is shown that the evolution of the excitation is related to the evolution of the speech spectrum. To minimise bit rates, the transmission of CWs is adapted to the spectral parameter evolution using the parameters derived from temporal decomposition of the spectral parameters.","PeriodicalId":359453,"journal":{"name":"2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123874742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2