Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452): latest papers

Isotropic noise modelling for nearfield array processing
T. Abhayapala, R. Kennedy, R. Williamson
{"title":"Isotropic noise modelling for nearfield array processing","authors":"T. Abhayapala, R. Kennedy, R. Williamson","doi":"10.1109/ASPAA.1999.810837","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810837","url":null,"abstract":"An exact series representation for a nearfield spherically isotropic noise model is introduced. The methodology uses the spherical harmonics expansion of the wavefield at a sensor to obtain the correlation between two sensors due to the nearfield isotropic noise field. The result is useful in nearfield application of sensor arrays. The proposed noise model can be utilized effectively to apply well established farfield array processing algorithms for nearfield applications. Specifically, any signal processing criterion based on farfield isotropic noise correlation can be reformulated with nearfield noise with this representation. A simple array gain optimization is used to demonstrate the new noise model.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123678279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects
Jean Laroche, M. Dolson
{"title":"New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects","authors":"Jean Laroche, M. Dolson","doi":"10.1109/ASPAA.1999.810857","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810857","url":null,"abstract":"The phase-vocoder is usually presented as a high-quality solution for time-scale modification of signals, pitch-scale modifications usually being implemented as a combination of time-scaling and sampling rate conversion. We present two new phase-vocoder-based techniques which allow direct manipulation of the signal in the frequency domain, enabling such applications as pitch-shifting, chorusing, harmonizing, partial stretching and other exotic modifications which cannot be achieved by the standard time-scale sampling-rate conversion scheme. The new techniques are based on a very simple peak-detection stage, followed by a peak-shifting stage. The simplest one allows for 50% overlap but restricts the precision of the modifications, while the most flexible technique requires a more expensive 75% overlap.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131839212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 110
Studies of a wideband stereophonic acoustic echo canceler
P. Eneroth, T. Gansler, S. Gay, J. Benesty
{"title":"Studies of a wideband stereophonic acoustic echo canceler","authors":"P. Eneroth, T. Gansler, S. Gay, J. Benesty","doi":"10.1109/ASPAA.1999.810886","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810886","url":null,"abstract":"In this paper a wideband stereophonic acoustic echo canceler is presented. The fundamental difficulty of stereophonic acoustic echo cancellation (SAEC) is described and an echo canceler based on a fast recursive least squares algorithm in a subband structure is proposed. This structure has been used in a real-time implementation, on which experiments have been performed. In the paper, simulation results of this implementation on real-life recordings, with 8 kHz bandwidth, are studied. The results clearly verify that the theoretical fundamental problem of SAEC also applies in real-life situations. They also show that more sophisticated adaptive algorithms are needed in the lower frequency regions than in the higher regions.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130094008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Application of the phase vocoder to pitch-preserving synchronization of an audio stream to an external clock
R. Sussman, J. Laroche
{"title":"Application of the phase vocoder to pitch-preserving synchronization of an audio stream to an external clock","authors":"R. Sussman, J. Laroche","doi":"10.1109/ASPAA.1999.810853","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810853","url":null,"abstract":"The phase vocoder is usually presented as a high-quality solution for time-scale modification of signals. Its main advantages versus the cheaper time-domain techniques include the high quality of the output for a wide range of types of input signals (speech, music, noise), and the possibility to perform very large factor modifications (e.g., four-fold time-stretching or more). In this paper, we present two applications that require such extreme modification factors: we call the first one pitch-preserving audio scrubbing, in which a user can move a pointer along an audio track and hear the sound at the corresponding location without any pitch alteration. Because the user controls the playback location (and therefore the playback speed), and can very well stop at a given location, the required time-scale modification can involve a very large factor. The second application consists of synchronizing an audio stream to a video stream, while avoiding pitch alteration. For extreme slow-motion playback, the time-scaling operation required to preserve the pitch can also involve a very large factor. We address theoretical and practical issues related to pitch-preserving synchronization of an audio track. Techniques are discussed to allow freezing time in the phase-vocoder and avoid problems associated with very large factor modifications.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113960115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
On some derivations of Gibson's approach for speech enhancement
É. Grivel, M. Gabrea, M. Najim
{"title":"On some derivations of Gibson's approach for speech enhancement","authors":"É. Grivel, M. Gabrea, M. Najim","doi":"10.1109/ASPAA.1999.810868","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810868","url":null,"abstract":"This paper deals with a Kalman filter-based enhancement of a speech signal embedded in colored noise, when using a single microphone system. Several approaches using Kalman filtering have been developed. More particularly, Gibson et al. (1991) reported an iterative method based on the so-called \"noise-free\" state space model, which may imply the introduction of a coordinate transformation to perform Kalman filtering. The authors do not address the identification issue. We propose some derivations of this method through an identification step using subspace methods for identification, previously developed in the field of control by Van Overschee (1993). The methods proposed here are then compared with other Kalman-based approaches.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132358238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A robustness analysis of 3D audio using loudspeakers
D. Ward, G. Elko
{"title":"A robustness analysis of 3D audio using loudspeakers","authors":"D. Ward, G. Elko","doi":"10.1109/ASPAA.1999.810882","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810882","url":null,"abstract":"It is well known that the effectiveness of 3D audio systems is critically dependent on the listener's head being in a known location. In this paper we analyze the fundamental role played by the loudspeaker positions in determining the robustness of the crosstalk canceler. Based on an extremely simple head model, we derive straightforward expressions for the loudspeaker positions that optimize the system robustness, which is measured by matrix condition numbers. These derived optimum positions are then compared with empirically-derived optimum positions obtained from actual HRTF (head related transfer function) measurements. The results indicate that our analytical expressions accurately predict the optimum loudspeaker positions.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"180 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129340436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Multifeature audio segmentation for browsing and annotation
G. Tzanetakis, P. Cook
{"title":"Multifeature audio segmentation for browsing and annotation","authors":"G. Tzanetakis, P. Cook","doi":"10.1109/ASPAA.1999.810860","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810860","url":null,"abstract":"Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the Web and elsewhere. Since manual indexing using existing audio editors is extremely time-consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the other hand, very few systems have been proposed for automatic indexing of music and general audio. Typically these systems rely on classification and similarity-retrieval techniques and work in restricted audio domains. A somewhat different, more general approach for fast indexing of arbitrary audio data is the use of segmentation based on multiple temporal features combined with automatic or semi-automatic annotation. In this paper, a general methodology for audio segmentation is proposed. A number of experiments were performed to evaluate the proposed methodology and compare different segmentation schemes. Finally, a prototype audio browsing and annotation tool based on segmentation combined with existing classification techniques was implemented.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"315 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116532026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 152
A systematic hybrid analog/digital audio coder
R. Barron, A. Oppenheim
{"title":"A systematic hybrid analog/digital audio coder","authors":"R. Barron, A. Oppenheim","doi":"10.1109/ASPAA.1999.810843","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810843","url":null,"abstract":"This paper describes a signal coding solution for a hybrid channel that is the composition of two channels: a noisy analog channel through which a signal source is sent unprocessed and a secondary rate-constrained digital channel. The source is processed prior to transmission through the digital channel. Signal coding solutions for this hybrid channel are clearly applicable to the in-band on-channel (IBOC) digital audio broadcast (DAB) problem. We present the design of a perceptually-based subband audio coder, with complexity comparable to conventional coders, that exploits a signal at the receiver of the form y[n]=g[n]*x[n]+u[n], where x[n], g[n], and u[n] denote respectively the source, the impulse response of convolutional distortion, and additive Gaussian noise. Concepts from conventional subband coding, e.g. subband decomposition, quantization, bit allocation, and lossless signal coding, are tailored to exploit the analog signal at the receiver such that frequency-weighted mean-squared error is minimized.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121938635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A speech feature based on Bark frequency warping-the non-uniform linear prediction (NLP) cepstrum
Yoon Kim, J.O. Smith
{"title":"A speech feature based on Bark frequency warping-the non-uniform linear prediction (NLP) cepstrum","authors":"Yoon Kim, J.O. Smith","doi":"10.1109/ASPAA.1999.810867","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810867","url":null,"abstract":"We propose a new method of obtaining features from speech signals for robust analysis and recognition: the non-uniform linear prediction (NLP) cepstrum. The objective is to derive a representation that suppresses speaker-dependent characteristics while preserving the linguistic quality of speech segments. The analysis is based on two principles. First, Bark frequency warping is performed on the LP spectrum to emulate the auditory spectrum. While widely used methods such as the mel-frequency and PLP analysis use the FFT spectrum as their basis for warping, the NLP analysis uses the LP-based vocal-tract spectrum with glottal effects removed. Second, all-pole modeling (LP) is used before and after the warping. The pre-warp LP is used to first obtain the vocal-tract spectrum, while the post-warp LP is performed to obtain a smoothed, two-peak model of the warped spectrum. Experiments were conducted to test the effectiveness of the proposed feature in the case of identification/discrimination of vowels uttered by multiple speakers using linear discriminant analysis (LDA), and frame-based vowel recognition with a statistical model. In both cases, the NLP analysis was shown to be an effective tool for speaker-independent speech analysis/recognition applications.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"57 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126000893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Improvements to the switched parametric and transform audio coder
S. Levine, J.O. Smith
{"title":"Improvements to the switched parametric and transform audio coder","authors":"S. Levine, J.O. Smith","doi":"10.1109/ASPAA.1999.810845","DOIUrl":"https://doi.org/10.1109/ASPAA.1999.810845","url":null,"abstract":"We introduce improvements to previous sines+transients+noise audio modeling systems, including new sinusoidal trajectory selection and quantization procedures. In a previous work by Levine and Smith (see Proc. Int. Conf. Acoustics, Speech, and Signal Processing, Phoenix, 1999), the audio is first segmented into transient and non-transient regions. The transient regions are modeled using traditional transform coding techniques, while the non-transient regions are modeled using parametric sines plus noise modeling. Because such a system contains a mix of parametric and non-parametric techniques, compressed-domain processing such as time-scale modification is possible.","PeriodicalId":229733,"journal":{"name":"Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129187925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5