Speech Coding, 2002, IEEE Workshop Proceedings: Latest Articles

MMSE decoding for vector quantization over channels with memory
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-09 DOI: 10.1109/SCW.2002.1215728
Heng-Iang Hsu, Wen-Whei Chang, Xiaobei Liu, S. Koh
Abstract: The paper presents memory-enhanced extensions of minimum mean-squared error (MMSE) decoding for vector quantization over noisy channels. We also develop a recursive algorithm for computing the transition probabilities of the Gilbert channel, and illustrate its performance in vector quantization of Gauss-Markov sources under noisy channel conditions. Simulation results indicate that the proposed algorithm enables the implementation of an MMSE decoder with increased robustness to channel errors.
Citations: 0
An iterative interpolative transform method for modeling harmonic magnitudes
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215716
T. Ramabadran, A. Smith, M. Jasiuk
Abstract: In this paper, we describe a method for modeling speech harmonic magnitudes, the accurate representation of which is essential for high-quality speech synthesis in several parametric vocoders. The given set of harmonic magnitudes is interpolated and transformed into the autocorrelation domain before an all-pole model is derived. Through an iterative procedure, the interpolation curve used in the frequency domain is improved. This new iterative interpolative transform (IIT) method has been found to model the harmonic magnitudes more accurately than earlier methods when measured in terms of log-spectral distortion.
Citations: 3
Perceptual QoS assessment methodologies for coded speech in networks
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215730
N. Kitawaki
Abstract: The paper reviews perceptual QoS assessment methodologies for coded speech in networks. The methods are mainly based on my contributions to ITU-T Study Group 12 since 1981. First, quality factors in communications networks are analyzed, and then appropriate assessment methods for coded speech are discussed. Finally, the current status of perceptual QoS measurement methodologies is described from the viewpoint of a network planning tool for compound quality factors in communications networks.
Citations: 3
A packet loss concealment method using pitch waveform repetition and internal state update on the decoded speech for the sub-band ADPCM wideband speech codec
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215726
M. Serizawa, Y. Nozawa
Abstract: The paper proposes a packet loss concealment (PLC) method for the SB-ADPCM (sub-band adaptive differential pulse code modulation) wideband speech codec. When a packet loss occurs, the concealment repeats, with attenuation, a pitch waveform of previously decoded speech to generate a speech waveform corresponding to the lost packet. The packet loss causes differences in the internal states, such as prediction filter states, between the encoder and decoder of the SB-ADPCM codec. This difference results in an annoying click noise during the period following the packet loss. The proposed method reduces this difference by updating the internal state based on the speech previously generated by the concealment. It also employs a forgetting factor control for the internal states, which reduces the impact of the packet loss on the internal states. Results from a five-grade mean opinion test show that the proposed method achieves around 3 (fair) or 4 (good) speech quality at a loss rate lower than 5%, and quality 0.4 to 1.0 higher than that of the conventional muting PLC method at packet loss rates of 1 to 10% with a packet size of 10 or 20 msec.
Citations: 18
The analysis of speech codecs using psychoacoustic measures
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215740
Mohammed Raad, C. Ritz, I. Burnett, A. Mertins
Abstract: This paper analyses two narrowband speech codecs, the 4.8 kbit/s FS1016 coder and the 8 kbit/s G.729 coder, using objective psychoacoustic measures. Four measures are used: loudness, sharpness, roughness and tonality. The results show sharpness and roughness to be the two major contributing factors to the subjective difference between the two coders.
Citations: 2
Wideband speech coder employing T-codes and reversible variable length codes
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215743
Hongqiang Wang, S. Koh, G. Shu
Abstract: The performance of speech coders, such as the ITU-T G.722.1 wideband speech coder, that employ non-self-synchronizing variable length codes is greatly affected when the received bit stream is in error. This paper studies the use of T-codes and reversible variable length codes (RVLC) to replace the Huffman codes recommended in the G.722.1 coder in order to improve its robustness when bit errors occur. Preliminary simulation results show significant improvement in coder performance with the proposed schemes.
Citations: 0
Speech and noise separations using comb filtering method for high quality speech coding
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215739
Y. Wang, K. Yoshida
Abstract: This paper presents speech and noise separation methods for achieving high-quality speech coding. In speech separation, a pitch harmonics restoration method is proposed. This method can effectively suppress the so-called musical noise and reduce speech distortion. In noise separation, the noise-base estimated in the speech separation process is used as a separated background noise, and a method for encoding the noise at low bit rates is proposed. The proposed methods, used as a preprocessor of the adaptive multi-rate wideband (AMR-WB) coder, are evaluated by the degradation category rating (DCR) test. An average improvement of 0.3 points under noise conditions is achieved compared with the conventional method without speech and noise separation.
Citations: 4
Quantization noise spectral shaping in instantaneous coding of spectrally unbalanced speech signals
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215722
G. Mahé, A. Gilloire
Abstract: In the context of centralized spectral equalization of speech in a telephone network, the signal is spectrally strongly unbalanced at the output of the equalizer, before being quantized, which results in low SNR at the receiver. We propose and evaluate experimentally two methods to reshape the quantization noise, in order to make it less perceptible at the receiver. The first consists of finding the most probable quantization sequence, given the desired noise spectrum. In the second, the filtered quantization error is added to the signal to be quantized.
Citations: 4
A scalable coder designed for 10-kHz bandwidth speech
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215741
M. Oshikiri, H. Ehara, K. Yoshida
Abstract: This paper presents a scalable speech coder with a rate of 23.85 kbit/s for encoding 10-kHz bandwidth speech signals. The perceptual quality of 10-kHz bandwidth speech signals is much better than that of 7-kHz bandwidth ones, and it is close to that of 20-kHz bandwidth ones. The 10-kHz bandwidth is therefore promising for high-fidelity conversational applications. The scalable coder consists of two layers: a base layer and an enhancement layer. The adaptive multi-rate wideband speech coder (AMR-WB) at 15.85 kbit/s and a transform coding method at 8 kbit/s are used for the base layer and the enhancement layer, respectively. This hybrid structure ensures efficient coding of the 10-kHz bandwidth speech. In the enhancement layer, the modified discrete cosine transform (MDCT) is used. Its analysis frame size is set to be short in order to minimize additional algorithmic delay. The total additional algorithmic delay of the enhancement layer is 5 ms. Since it is difficult to quantize all the MDCT coefficients at 8 kbit/s, we have limited the region for quantization to the range from 6 kHz to 9 kHz to improve the perceptual quality of decoded speech. Our subjective evaluation test results indicate that the quality of the proposed coder clearly exceeds that of AMR-WB at 23.85 kbit/s under both clean and noisy conditions.
Citations: 5
A 1200/2400 bps coding suite based on MELP
Speech Coding, 2002, IEEE Workshop Proceedings. Pub Date : 2002-10-06 DOI: 10.1109/SCW.2002.1215734
Tian Wang, K. Koishida, V. Cuperman, A. Gersho, J. Collura
Abstract: This paper presents key algorithm features of the future NATO narrow band voice coder (NBVC), a 1.2/2.4 kbps speech coder with noise preprocessor based on the MELP analysis algorithm. At 1.2 kbps, the MELP parameters for three consecutive frames are grouped into a superframe and jointly quantized to obtain high coding efficiency. The inter-frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe. Novel techniques used at 1.2 kbps include pitch vector quantization using pitch differentials, joint quantization of pitch and U/V decisions, and LSF quantization with a forward-backward interpolation method. A new harmonic synthesizer, which improves the reproduction quality, is introduced for both rates. Subjective test results indicate that the 1.2 kbps speech coder achieves quality close to the existing federal standard 2.4 kbps MELP coder.
Citations: 63