Latest papers from the 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Rate-distortion optimization for multichannel audio compression
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701839
Minyue Li, J. Skoglund, W. Kleijn
{"title":"Rate-distortion optimization for multichannel audio compression","authors":"Minyue Li, J. Skoglund, W. Kleijn","doi":"10.1109/WASPAA.2013.6701839","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701839","url":null,"abstract":"Multichannel audio coding is studied from a rate-distortion theoretical viewpoint. Two practical coding techniques, both of which are based on rate-distortion optimization, are also proposed. The first technique decorrelates a multichannel signal hierarchically using elementary unitary transforms. The second method rearranges a multichannel signal into sub-signals and compresses them at optimized bit rates using a conventional codec. Both objective and subjective tests were conducted to illustrate the efficiency of the methods.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133465472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Robust DOA estimation of speech signals via sparsity models using microphone arrays
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701823
Eleonora Cagli, Diego Carrera, G. Aletti, G. Naldi, B. Rossi
{"title":"Robust DOA estimation of speech signals via sparsity models using microphone arrays","authors":"Eleonora Cagli, Diego Carrera, G. Aletti, G. Naldi, B. Rossi","doi":"10.1109/WASPAA.2013.6701823","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701823","url":null,"abstract":"Direction-of-arrival (DOA) estimation of speech signals using a set of spatially separated microphones in an array is a problem arising in many practical applications. Examples include human-computer interfaces, automatic camera-steering systems for multiparticipant videoconferencing, and tracking systems in smart home environments. This paper introduces a robust method for speech signal localization that makes use of sparsity models for signal representation, and includes an analysis of the denoising problem for realistic applications using MEMS microphone arrays. Experimental results on both synthetic and real speech data show that the proposed method is noise-robust and provides highly reliable localization performance even with multiple sources and a small number of microphones.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115845915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Perceptual Cepstral filters for speech and music processing
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701858
R. Mignot, V. Välimäki
{"title":"Perceptual Cepstral filters for speech and music processing","authors":"R. Mignot, V. Välimäki","doi":"10.1109/WASPAA.2013.6701858","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701858","url":null,"abstract":"Source-filter modeling of speech or musical tones requires a filter model for the spectral envelope of the signal. To reduce the number of modeling parameters, one idea is to use psychoacoustic knowledge to encode only the perceptually relevant information. Starting from an accurate estimate of the original spectral envelope, which includes imperceptible details, we propose to use its Mel-Frequency Cepstral Coefficient (MFCC) representation to capture the perceptually relevant information. Then, a new inverse process is presented to derive a smoother, but perceptually equivalent spectral envelope. For instance, this new method can be applied in speech coding, and thanks to the good properties of the MFCC representation, perceptual interpolation of sounds is made easier.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126990325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
MINTFormer: A spatially aware channel equalizer
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701881
Felicia Lim, Mark R. P. Thomas, P. Naylor
{"title":"MINTFormer: A spatially aware channel equalizer","authors":"Felicia Lim, Mark R. P. Thomas, P. Naylor","doi":"10.1109/WASPAA.2013.6701881","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701881","url":null,"abstract":"Reverberation is a process that distorts a wanted signal and impairs perceived speech quality. In the context of multichannel dereverberation, channel-based methods and beamforming are two common approaches. Channel-based methods such as the multiple input/output inverse theorem (MINT) can provide perfect dereverberation provided the exact acoustic impulse responses (AIRs) are known. However, they have been shown to be very sensitive to AIR estimation errors for which several modifications have consequently been proposed. Conversely, beamformers are significantly more robust but provide comparatively modest dereverberation. While the two approaches are conventionally considered independent, both can be formulated as a filter-and-sum operation with differing filter design criteria. We propose a unified framework, termed MINT-Forming, that exploits this similarity and introduces a mixing parameter to control the tradeoff between the potential performance of MINT and the robustness of beamforming. Empirical results show that the mixing parameter is a monotonic function of channel estimation error, whereby a MINT solution is preferred when channel estimation error is low.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124812059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
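The MINTFormer abstract notes that channel-based equalization and beamforming can both be written as a filter-and-sum operation that differs only in how the filters are designed. The sketch below illustrates that shared structure on a toy two-channel case. The impulse responses, the filter values, and the helper names `convolve` and `filter_and_sum` are all assumptions chosen for illustration; this is not the authors' MINT-Forming design and it does not model their mixing parameter.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def filter_and_sum(channels, filters):
    """Filter each channel with its own FIR filter, then sum the results."""
    filtered = [convolve(c, g) for c, g in zip(channels, filters)]
    length = max(len(f) for f in filtered)
    return [sum(f[n] for f in filtered if n < len(f)) for n in range(length)]


# Hypothetical two-tap acoustic impulse responses (direct path plus one echo).
airs = [[1.0, 0.5], [1.0, 0.25]]

# Inverse-style design: filters chosen so that the summed, filtered responses
# equal a unit impulse (exact here because the toy AIRs are known exactly).
mint_filters = [[-1.0], [2.0]]

# Simple averaging of the channels, a stand-in for a basic beamformer.
average_filters = [[0.5], [0.5]]

mint_response = filter_and_sum(airs, mint_filters)
average_response = filter_and_sum(airs, average_filters)

print("inverse-style equalized response:", mint_response)
print("averaged response:", average_response)
```

In this toy case the inverse-style filters remove the echo tap entirely, while averaging only attenuates it, which mirrors the qualitative contrast drawn in the abstract when the impulse responses are known exactly.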
Frequency domain multi-channel expectation maximization algorithm for audio background noise reduction
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701859
Jichi Deng, S. Godsill
{"title":"Frequency domain multi-channel expectation maximization algorithm for audio background noise reduction","authors":"Jichi Deng, S. Godsill","doi":"10.1109/WASPAA.2013.6701859","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701859","url":null,"abstract":"In this paper we implement expectation maximization (EM) based methods in the short time Fourier transform (STFT) domain for background noise reduction in multi-channel systems. The models introduce a Wishart prior for the unknown signal covariance matrix. An EM algorithm is used to maximise the posterior probability for the clean signal, approaching a stationary point of the distribution with increasing iterations. The background noise is modelled as white and stationary in this initial work. The proposed methods are found to outperform a multi-channel Wiener filter in terms of residual noise artefacts and MSE for a small initial trial.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121679005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Gaussian process data fusion for heterogeneous HRTF datasets
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701842
Yuancheng Luo, D. Zotkin, R. Duraiswami
{"title":"Gaussian process data fusion for heterogeneous HRTF datasets","authors":"Yuancheng Luo, D. Zotkin, R. Duraiswami","doi":"10.1109/WASPAA.2013.6701842","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701842","url":null,"abstract":"Head-Related Transfer Function (HRTF) measurement and extraction are important tasks for personalized-spatial audio. Many laboratories have their own apparatuses for data-collection but few studies have compared their results to a common subject or have modeled inter-dataset variances. We present a Bayesian fusion method based on Gaussian process (GP) modeling of joint spatial-frequency HRTFs over different spherical-measurement grids. Neumann KU-100 dummy HRTFs from 7 labs in the “Club Fritz” study are compared and fused to each other based on learning a set of transformations from the GP data-likelihood and covariance assumptions; parameter and hyperparameter training is automatic. Experimental results show that fused models for horizontal and median-plane HRTFs generalize the datasets better than pre-transformed ones.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128360816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
A new clustering approach for solving the permutation problem in convolutive blind source separation
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701852
Radoslaw Mazur, J. Jungmann, A. Mertins
{"title":"A new clustering approach for solving the permutation problem in convolutive blind source separation","authors":"Radoslaw Mazur, J. Jungmann, A. Mertins","doi":"10.1109/WASPAA.2013.6701852","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701852","url":null,"abstract":"In this paper we propose a new clustering approach for solving the permutation ambiguity in convolutive blind source separation. After the transformation to the time-frequency domain, the problem of separation of sources can be reduced to multiple instantaneous problems, which may be solved using independent component analysis. The drawbacks of this approach are the inherent permutation and scaling ambiguities, which have to be corrected before the transformation to the time domain. Here, we propose a new method that allows for aligning up to several hundreds of consecutive bins into clusters. The depermutation of these clusters using some known techniques is then much easier than the original problem. The performance of the proposed method is evaluated on real-room recordings.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133960881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Room impulse response synthesis based on a 2D multi-plane FDTD hybrid acoustic model
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701887
Stephen Oxnard, D. Murphy
{"title":"Room impulse response synthesis based on a 2D multi-plane FDTD hybrid acoustic model","authors":"Stephen Oxnard, D. Murphy","doi":"10.1109/WASPAA.2013.6701887","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701887","url":null,"abstract":"This paper exposes, and analyzes the validity of, a novel hybrid acoustic modeling system created through complementary assimilation of 3D geometric and 2D numerical modeling techniques. It is demonstrated that multiple 2D Finite Difference Time Domain schemes may be employed to simulate low-frequency sound wave propagation throughout a simplistic 3D enclosure, thus avoiding the immense computational challenges posed by 3D numerical approaches. Band limited room impulse responses (RIRs) generated in this way may be appropriately calibrated and combined with high-frequency results obtained from well-established geometric modeling methods to realize efficient, yet accurate hybrid RIR synthesis. Objective results show that the low-frequency 2D multiplane solution yields comparable accuracy to that gained through 3D simulation while achieving a run-time reduction of 99.15%.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133411703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
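The room impulse response abstract relies on 2D Finite Difference Time Domain schemes for the low-frequency part of the response. The following is a minimal sketch of a generic textbook 2D leapfrog update for the scalar wave equation on a single plane, recording the pressure at one receiver cell. The grid size, source and receiver positions, zero-pressure boundaries, and the function name `simulate_2d_fdtd` are assumptions for illustration; the authors' multi-plane arrangement, calibration, and boundary models are not reproduced here.

```python
def simulate_2d_fdtd(size=21, source=(5, 5), receiver=(15, 15), steps=40):
    """Run a basic 2D FDTD scheme and return the pressure at the receiver."""
    # Squared Courant number (c*dt/dx)^2, set to the 2D stability limit of 1/2.
    courant_sq = 0.5
    prev = [[0.0] * size for _ in range(size)]
    curr = [[0.0] * size for _ in range(size)]
    # Unit pressure impulse at the source cell as the initial condition.
    curr[source[0]][source[1]] = 1.0

    response = []
    for _ in range(steps):
        nxt = [[0.0] * size for _ in range(size)]
        # Update interior cells only; edge cells stay at zero pressure,
        # which is a simplification of real room boundaries.
        for i in range(1, size - 1):
            for j in range(1, size - 1):
                neighbours = (curr[i + 1][j] + curr[i - 1][j]
                              + curr[i][j + 1] + curr[i][j - 1])
                nxt[i][j] = (2.0 * curr[i][j] - prev[i][j]
                             + courant_sq * (neighbours - 4.0 * curr[i][j]))
        prev, curr = curr, nxt
        response.append(curr[receiver[0]][receiver[1]])
    return response


rir = simulate_2d_fdtd()
first_arrival = next(n for n, v in enumerate(rir) if v != 0.0)
print("first non-zero sample index at receiver:", first_arrival)
```

Each time step touches every cell of one plane, so the cost grows with the plane area rather than the room volume, which is the intuition behind replacing a full 3D grid with a small number of 2D planes.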
Wave-domain echo-path model with aliasing for echo cancellation
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701844
S. Emura, Y. Hiwasaki, H. Ohmuro
{"title":"Wave-domain echo-path model with aliasing for echo cancellation","authors":"S. Emura, Y. Hiwasaki, H. Ohmuro","doi":"10.1109/WASPAA.2013.6701844","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701844","url":null,"abstract":"Wave-domain adaptive filtering for echo cancellation has been proposed for achieving immersive full-duplex sound conferencing that uses wave field reconstruction as spatial sound rendering. In wave-domain adaptive filtering, fundamental solutions of the wave equation are spatially sampled and used as the orthogonal basis functions. This sampling is determined by loudspeaker spacing and results in aliasing; aliasing occurs above a few thousand Hz for spacing of several centimeters. The goal of this work is to investigate the effect of applying adaptive filtering on echo signal with aliasing when the loudspeaker array and microphone array are uniform linear arrays of identical geometries. We came to the conclusion that we can apply the wave-domain echo-path model, used below spatial Nyquist frequency, to wave-domain adaptive filtering over this frequency even in the presence of aliasing components.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114962131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
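The wave-domain abstract states that spatial aliasing appears above a few thousand Hz when the loudspeaker spacing is several centimeters. That order of magnitude can be checked with the usual half-wavelength rule for a uniform linear array, f = c / (2d). A minimal sketch, assuming a speed of sound of 343 m/s (not given in the abstract) and a hypothetical helper name `spatial_nyquist_hz`:

```python
def spatial_nyquist_hz(spacing_m, speed_of_sound=343.0):
    """Frequency above which a uniform linear array with the given element
    spacing samples the sound field at less than two points per wavelength."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return speed_of_sound / (2.0 * spacing_m)


# Example spacings of a few centimeters (illustrative values only).
for spacing_cm in (2, 5, 10):
    limit = spatial_nyquist_hz(spacing_cm / 100.0)
    print(f"{spacing_cm} cm spacing -> about {limit:.0f} Hz")
```

Under these assumptions the limit falls between roughly 1.7 kHz and 8.6 kHz for spacings of 10 cm down to 2 cm, consistent with the abstract's description.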
Advanced speech-audio processing in mobile phones and hearing aids: Synergies and distinctions
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701899
P. Vary
{"title":"Advanced speech-audio processing in mobile phones and hearing aids: Synergies and distinctions","authors":"P. Vary","doi":"10.1109/WASPAA.2013.6701899","DOIUrl":"https://doi.org/10.1109/WASPAA.2013.6701899","url":null,"abstract":"Summary form only given. Mobile phones and modern hearing aids comprise advanced digital signal processing techniques as well as coding algorithms. From a functional point of view, digital hearing devices and mobile phones are approaching each other. In both types of devices similar or partly even identical algorithms can be found such as echo, reverberation and feedback control, noise reduction, intelligibility enhancement, artificial bandwidth extension, and binaural processing with two or more microphones. Current hearing aids include digital audio receivers and transmitters not only for communication and entertainment but also for binaural directional processing. State-of-the-art mobile phones offer new speech-audio compression schemes for the emerging HD-telephone services and they are equipped with two (or more) microphones for the purpose of speech enhancement. Thus, it is not too big a step to realize hearing aid features as apps on smart phones. The further evolution might lead us to binaural mobile telephony, providing ambient and spatial information - a preferred solution for audio conferencing, for example. Despite these relations, the signal conditions and the processing constraints are quite different, e.g., with respect to coherence of signals, complexity of algorithms, coding-noise shaping for binaural processing, power consumption, and latency. Synergies and distinctions of the corresponding signal processing and coding algorithms will be discussed. Design constraints and solutions will be presented by examples.","PeriodicalId":341888,"journal":{"name":"2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131832327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3