{"title":"A noise suppresser for the AMR speech codec and evaluation test results based on 3GPP specifications","authors":"S. Furuta, S. Takahashi","doi":"10.1109/SCW.2002.1215757","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215757","url":null,"abstract":"We propose a noise suppresser (AMR-NS) for the 3GPP standard AMR (adaptive multirate) speech codec. The proposed AMR-NS method is based on spectral subtraction and is structured from spectral amplitude suppression and spectral subtraction that are adaptively controlled by the input signal. The subjective evaluation results based on 3GPP performance requirements (3GPP TS26.077) in a third party testing laboratory show that the proposed AMR-NS has satisfied all the requirements for Japanese and English speech materials. It has also satisfied all the objective evaluation requirements.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115929852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coding unconstrained FCB excitation using combinatorial and Huffman codes","authors":"U. Mittal, J. P. Ashley, E. M. Cruz-Zeno","doi":"10.1109/SCW.2002.1215747","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215747","url":null,"abstract":"A method for coding \"unconstrained\" fixed codebook (FCB) excitation for ACELP speech coders is proposed. The unconstrained FCB does not place track-based constraint on the pulse positions. The coding method combines Huffman codes and combinatorial codes. The method is less sensitive to bit errors and is nearly as efficient as the combinatorial codes. A method for efficiently storing the parameters needed in the combinatorial codebook is also proposed.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124252193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Host laboratory role in the selection of the future NATO narrow band voice coder","authors":"M. Street","doi":"10.1109/SCW.2002.1215735","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215735","url":null,"abstract":"This paper describes the role and responsibilities of the host laboratory in the multi-national test and selection process for the future NATO narrow band voice coder standard. The selection was made from a number of implementations of narrow band voice coders submitted by NATO member nations. Voice coders were installed on a voice processing workstation at the host laboratory in fixed and floating point forms, together with a number of reference coders. The voice coders were then comprehensively tested in a wide range of noise environments and conditions which were representative of military scenarios. Over 500 hours of processed speech was generated and analysed during these tests.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126448717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subspace-based speech enhancement with rank-deficient prewhitening","authors":"S.H. Jensen, J.P. Kargo, C. Rødbro, K. Sørensen","doi":"10.1109/SCW.2002.1215760","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215760","url":null,"abstract":"We study a new subspace-based noise reduction technique that can handle the case of narrowband noise, i.e. the case where the noise covariance matrix is rank-deficient. The formulation of the technique is based on the quotient singular value decomposition (QSVD) and is a generalization of existing techniques that can handle only broadband noise. We also show by examples that we are able to achieve a satisfactory noise reduction result.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121502277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMR and AMR-WB RTP payload usage in packet switched conversational multimedia services","authors":"A. Lakaniemi, P. Ojala, H. Toukomaa","doi":"10.1109/SCW.2002.1215753","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215753","url":null,"abstract":"The RTP payload format for AMR and AMR-WB speech codecs has recently been approved in IETF. The new payload contains several functionalities enabling different methods transmitting speech parameters over packet switched network in both wired and wireless environment in an error robust manner. This paper gives a description of the functionalities and benefits of the new RTP payload. In addition, the performance of the RTP payload enhancements is evaluated when IP packets are transmitted over erroneous radio channel in WCDMA.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129092864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The effect of source based rate adaptation extension in AMR-WB speech codec","authors":"J. Makinen, P. Ojala, J. Vainio","doi":"10.1109/SCW.2002.1215755","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215755","url":null,"abstract":"This paper presents a source based rate adaptation concept for AMR wideband speech codec. The source based rate adaptation algorithm selects the multi rate codec mode based on the input speech characteristics and coding parameters to minimise the average bit rate. The presented concept introduces up to 50% reduction in average bit rate without any degradation in speech quality. The benefit of source based adaptation is in increasing the system capacity in conversational services as well as storage size in messaging type of applications.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134148856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The adaptive multi-rate wideband codec: history and performance","authors":"R. Salami, B. Bessette, R. Lefebvre, M. Jelinek, J. Rotola-Pukkila, J. Vainio, H. Mikkola, K. Jarvinen","doi":"10.1109/SCW.2002.1215752","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215752","url":null,"abstract":"This paper gives the history and performance of the adaptive multi-rate wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services. The AMR-WB speech codec algorithm was selected in December 2000, and the corresponding specifications were approved in March 2001. In July 2001, the AMR-WB codec was also selected by ITU-T in the standardization activity for wideband speech coding around 16 kbit/s. The adoption of AMR-WB by ITU-T is of significant importance since for the first time the same codec is adopted for wireless as well as wireline services. AMR-WB uses an extended audio bandwidth from 3.4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to 2nd and 3rd generation mobile communication systems.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133280199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A wideband noise suppressor for the AMR wideband speech codec","authors":"M. Kato, A. Sugiyama, M. Serizawa","doi":"10.1109/SCW.2002.1215756","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215756","url":null,"abstract":"A wideband noise suppressor for the AMR (adaptive multi-rate) wideband speech codec is proposed. The wideband noise suppressor features weighted noise estimation for an accurate noise estimate, pseudo noise injection for more suitable spectral gain, and synthesis windowing for smooth transition at frame boundaries. In the subjective evaluation with the AMR wideband speech codec, the proposed noise suppressor satisfies all eighteen provisional requirements, which was originally standardized for AMR narrowband noise suppressor, in absolute category rating, ten out of twelve provisional requirements in comparison category rating (CCR), respectively. Although it does not meet two requirements in CCR, its basic performance suggests that the proposed wideband noise suppressor is most likely to meet all requirements by the evaluation with 24 listeners specified in the test plan for AMR narrowband noise suppressor.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"518 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133697775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DSP multi-channel audio decoding","authors":"G. Wineinger","doi":"10.1109/SCW.2002.1215758","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215758","url":null,"abstract":"Summary form only given. The multitude of audio coding schemes will continue to increase, as more and diverse ways are found to get closer to real 3D audio. The challenges in keeping up with new formats are best handled by programmable DSPs; however, standardization or pseudo-standardization yields to more cost effective approaches. Programmable DSPs give a faster time to market and have lower engineering costs than patching or re-developing a custom VLSI design. Once the decoding algorithm has become an industry standard, custom or ASIC solutions usually yield a more cost effective approach. We discuss some of these approaches and their advantages and disadvantages and the reasons to choose one over the other. Special purpose hardware often consumes much less power, but other factors to be considered are the required interfaces and whether they are likely to change. Deciding which option to pursue and researching and understanding the objectives are also key factors to a successful solution. We must never underestimate the design time, effort, and cost to be incurred in completing the full custom, ASIC development, or programmable DSP. A robust solution is necessary to compete. The issues that surround audio decoding solution alternatives need to be discussed.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123027118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pitch synchronous split-band LPC (PS-SBLPC) vocoder","authors":"C. Sturt, S. Villette, A. Kondoz","doi":"10.1109/SCW.2002.1215748","DOIUrl":"https://doi.org/10.1109/SCW.2002.1215748","url":null,"abstract":"A pitch-synchronous split band LPC (PS-SBLPC) speech coder is proposed. In this new paradigm, harmonic analysis is carried out on individual pitch cycle waveforms (PCWs) rather than using a large window. PCWs are identified using a trapezoidal window search performed on a modified time envelope signal. In order to achieve a fixed rate coder the PCW parameters are jointly quantised using a combined interpolation and quantisation routine. Combining interpolation and quantisation allows for high correlation between successive PCWs to be exploited, without subjecting rapid transitions to time smoothing. During speech synthesis, no interpolation is applied, as parameter smoothing is provided during the quantisation. Simulation results comparing the PS-SBLPC model with the SB-LPC model show that the quality of the PS-SBLPC speech signal is significantly better than that of the split band LPC (SB-LPC). Initial results have shown that quantisation optimisation leads to vast improvements in speech quality during speech transitions.","PeriodicalId":140750,"journal":{"name":"Speech Coding, 2002, IEEE Workshop Proceedings.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114169192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}