Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452): latest articles, page 3

Isotropic noise modelling for nearfield array processing

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810837

T. Abhayapala, R. Kennedy, R. Williamson

Citations: 2

New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810857

Jean Laroche, M. Dolson

Citations: 110

Studies of a wideband stereophonic acoustic echo canceler

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810886

P. Eneroth, T. Gansler, S. Gay, J. Benesty

Citations: 13

Application of the phase vocoder to pitch-preserving synchronization of an audio stream to an external clock

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810853

R. Sussman, J. Laroche

Abstract: The phase vocoder is usually presented as a high-quality solution for time-scale modification of signals. Its main advantages over the cheaper time-domain techniques include the high quality of the output for a wide range of input signal types (speech, music, noise) and the possibility of performing very large-factor modifications (e.g., four-fold time-stretching or more). In this paper, we present two applications that require such extreme modification factors. We call the first one pitch-preserving audio scrubbing, in which a user can move a pointer along an audio track and hear the sound at the corresponding location without any pitch alteration. Because the user controls the playback location (and therefore the playback speed), and can very well stop at a given location, the required time-scale modification can involve a very large factor. The second application consists of synchronizing an audio stream to a video stream while avoiding pitch alteration. For extreme slow-motion playback, the time-scaling operation required to preserve the pitch can also involve a very large factor. We address theoretical and practical issues related to pitch-preserving synchronization of an audio track. Techniques are discussed to allow freezing time in the phase vocoder and to avoid problems associated with very large-factor modifications.

Citations: 6

On some derivations of Gibson's approach for speech enhancement

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810868

É. Grivel, M. Gabrea, M. Najim

Citations: 0

A robustness analysis of 3D audio using loudspeakers

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810882

D. Ward, G. Elko

Citations: 5

Multifeature audio segmentation for browsing and annotation

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810860

G. Tzanetakis, P. Cook

Abstract: Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the Web and elsewhere. Since manual indexing using existing audio editors is extremely time-consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the other hand, very few systems have been proposed for automatic indexing of music and general audio. Typically these systems rely on classification and similarity-retrieval techniques and work in restricted audio domains. A somewhat different, more general approach for fast indexing of arbitrary audio data is the use of segmentation based on multiple temporal features combined with automatic or semi-automatic annotation. In this paper, a general methodology for audio segmentation is proposed. A number of experiments were performed to evaluate the proposed methodology and compare different segmentation schemes. Finally, a prototype audio browsing and annotation tool based on segmentation combined with existing classification techniques was implemented.

Citations: 152

A systematic hybrid analog/digital audio coder

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810843

R. Barron, A. Oppenheim

Citations: 3

A speech feature based on Bark frequency warping-the non-uniform linear prediction (NLP) cepstrum

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810867

Yoon Kim, J.O. Smith

Abstract: We propose a new method of obtaining features from speech signals for robust analysis and recognition: the non-uniform linear prediction (NLP) cepstrum. The objective is to derive a representation that suppresses speaker-dependent characteristics while preserving the linguistic quality of speech segments. The analysis is based on two principles. First, Bark frequency warping is performed on the LP spectrum to emulate the auditory spectrum. While widely used methods such as mel-frequency and PLP analysis use the FFT spectrum as their basis for warping, the NLP analysis uses the LP-based vocal-tract spectrum with glottal effects removed. Second, all-pole modeling (LP) is used before and after the warping. The pre-warp LP is used to first obtain the vocal-tract spectrum, while the post-warp LP is performed to obtain a smoothed, two-peak model of the warped spectrum. Experiments were conducted to test the effectiveness of the proposed feature in two cases: identification/discrimination of vowels uttered by multiple speakers using linear discriminant analysis (LDA), and frame-based vowel recognition with a statistical model. In both cases, the NLP analysis was shown to be an effective tool for speaker-independent speech analysis/recognition applications.

Citations: 10

Improvements to the switched parametric and transform audio coder

Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452) Pub Date : 1999-10-17 DOI: 10.1109/ASPAA.1999.810845

S. Levine, J.O. Smith

Citations: 5