2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics — Latest Publications

Influence of secondary path estimation errors on the performance of ANC-motivated noise reduction algorithms for hearing aids
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701812
Derya Dalga, S. Doclo
Abstract: Current noise reduction techniques for open-fitting hearing aids that only use the external microphones on the hearing aid typically disregard the occurrence of signal leakage through the open fitting, leading to degraded noise reduction performance. Using an ear mould with an internal (so-called error) microphone provides information about the signal leakage and hence makes it possible to improve the noise reduction performance. Recently, feedforward and combined feedforward-feedback active-noise-control-motivated (FF ANC and FF-FB ANC, respectively) algorithms for noise reduction have been presented for such open-fitting hearing aids. The noise reduction filters of these ANC-motivated algorithms depend on an estimate of the so-called secondary path between the receiver and the error microphone. In this paper, we analyze the influence of secondary path estimation errors on the performance of the ANC-motivated algorithms. For the FF ANC algorithm it is possible to derive a closed-form expression of the filter as a function of the secondary path estimation error and to derive limit values for the allowable secondary path estimation errors. In addition, simulations show that even when estimation errors occur, the FF-FB ANC algorithm still outperforms the FF ANC algorithm.
Citations: 2
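The abstract above centers on how a secondary-path estimate enters an ANC-style adaptation loop. As an illustrative sketch (not the paper's FF/FF-FB algorithms), the classic filtered-x LMS update shows exactly where that estimate is used: the reference signal is filtered through the secondary-path *estimate* before driving the weight update, so any mismatch between the estimate and the true path distorts the adaptation. All signal names and path coefficients here are hypothetical.

```python
import numpy as np

def fxlms(x, d, s_true, s_hat, n_taps=16, mu=0.01):
    """Minimal filtered-x LMS sketch: adapt a control filter w using the
    secondary-path ESTIMATE s_hat to pre-filter the reference x, while the
    anti-noise actually propagates through the true path s_true.
    A mismatch between s_true and s_hat slows or destabilizes convergence,
    which is the kind of effect the paper analyzes."""
    w = np.zeros(n_taps)
    x_f = np.convolve(x, s_hat)[:len(x)]     # filtered-x via the estimate
    y_buf = np.zeros(len(s_true))            # secondary-path state
    e = np.zeros(len(x))
    for n in range(n_taps, len(x)):
        xn = x[n - n_taps:n][::-1]
        y = w @ xn                           # anti-noise sample
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = y
        e[n] = d[n] - s_true @ y_buf         # error-microphone signal
        xfn = x_f[n - n_taps:n][::-1]
        w += mu * e[n] * xfn                 # LMS update on filtered-x
    return w, e
```

With a matched estimate (s_hat equal to s_true) the residual at the error microphone decays toward zero for a tonal disturbance; perturbing s_hat lets one observe the degradation the paper quantifies analytically.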
A simple adaptive cardioid direction finding algorithm
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701831
G. Elko, Jens Meyer
Abstract: A simple adaptive cardioid direction-finder algorithm using signals from closely spaced omnidirectional microphones is described. One implementation utilizes a computationally simple constrained LMS adaptive filter with only 3 taps for the general 3D case and 2 taps for the 2D case. The solution adaptively finds the location of the single cardioid null that minimizes the output power of a generally 2D or 3D rotated cardioid.
Citations: 1
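The null-steering idea in this abstract can be sketched in a few lines for the 2D case: combine a forward and a backward cardioid with a single adaptive weight, minimize output power with a normalized LMS update, and read the null (source) angle off the converged weight. This is an idealized sketch, not the authors' exact constrained-LMS formulation; it assumes the two cardioid signals are already formed from the omni pair.

```python
import numpy as np

def adaptive_cardioid_doa(c_f, c_b, mu=0.1, eps=1e-8):
    """Sketch of an adaptive back-to-back-cardioid null-steerer (2D case):
    adapt a single mixing weight beta via NLMS so that the output
    y = c_f - beta*c_b is minimized in power; the null angle follows from
    beta. Idealized: c_f/c_b are pre-formed forward/backward cardioids."""
    beta = 0.0
    for f, b in zip(c_f, c_b):
        y = f - beta * b
        beta += mu * y * b / (b * b + eps)   # one-weight NLMS update
    beta = float(np.clip(beta, 0.0, None))
    # first-order cardioids: the null of c_f - beta*c_b sits at
    # cos(theta) = (beta - 1) / (beta + 1)
    return np.degrees(np.arccos((beta - 1.0) / (beta + 1.0)))
```

For a source at angle theta, the forward/backward cardioid gains are (1+cos theta)/2 and (1-cos theta)/2, so the power-minimizing beta places the null exactly on the source.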
A sparse nonuniformly partitioned multidelay filter for acoustic echo cancellation
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701832
D. Giacobello, Joshua Atkins
Abstract: In this paper, we propose a formulation of the multidelay adaptive filter for acoustic echo cancellation by modeling the echo path using sparse nonuniform partitions. The nonuniform partitioning allows for a low algorithmic delay without sacrificing the high order of the adaptive filter. It also further improves upon the computational efficiency of the uniformly partitioned multidelay filter by leveraging larger FFT sizes for certain partitions. The sparsity constraint allows for the definition of active and inactive regions of the adaptive filter, providing a better estimate of the order of the filter. Simulation results are provided showing increased convergence speed with the same steady-state misalignment compared to traditional multidelay filtering with both uniform and nonuniform partitioning.
Citations: 2
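The building block this paper extends is the uniformly partitioned frequency-domain (multidelay) filter: the impulse response is split into fixed-size partitions, each applied in the frequency domain with a matching block delay, trading long filters for low latency. A minimal fixed-coefficient sketch of that uniform case (overlap-save, no adaptation, no sparse/nonuniform partitioning — those are the paper's contributions) looks like this; all sizes are illustrative.

```python
import numpy as np

def partitioned_fir(x, h, block=64):
    """Uniformly partitioned overlap-save convolution, the static core of
    the multidelay filter (MDF). h is split into `block`-length partitions;
    partition p is applied in the frequency domain to the input block
    delayed by p blocks, and the per-partition spectra are summed."""
    n = len(x)
    n_part = -(-len(h) // block)                       # ceil division
    h = np.pad(h, (0, n_part * block - len(h)))
    H = np.fft.rfft(h.reshape(n_part, block), n=2 * block, axis=1)
    x = np.pad(x, (0, (-n) % block))
    y = np.zeros(len(x))
    buf = np.zeros(2 * block)                          # overlap-save buffer
    X_hist = np.zeros((n_part, block + 1), dtype=complex)  # spectral delay line
    for i in range(0, len(x), block):
        buf = np.concatenate([buf[block:], x[i:i + block]])
        X_hist = np.roll(X_hist, 1, axis=0)            # shift the delay line
        X_hist[0] = np.fft.rfft(buf)
        Y = (X_hist * H).sum(axis=0)                   # sum partition outputs
        y[i:i + block] = np.fft.irfft(Y)[block:]       # keep the valid half
    return y[:n]
```

The output matches direct convolution; an adaptive echo canceller would update H per block, and the paper's nonuniform variant would mix partition sizes (small early partitions for latency, large late ones for efficiency).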
Geometrically Constrained TRINICON-based relative transfer function estimation in underdetermined scenarios
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701822
K. Reindl, S. M. Golan, Hendrik Barfuss, S. Gannot, Walter Kellermann
Abstract: Speech extraction in a reverberant enclosure using a linearly-constrained minimum variance (LCMV) beamformer usually requires reliable estimates of the relative transfer functions (RTFs) of the desired source to all microphones. In this contribution, a geometrically constrained (GC)-TRINICON concept for RTF estimation is proposed. This approach is applicable in challenging multiple-speaker scenarios and in underdetermined situations, where the simultaneously active sources outnumber the available microphone signals. As a most practically relevant and distinctive feature, this concept does not require any voice-activity-based control mechanism. It only requires coarse reference information on the target direction of arrival (DoA). The proposed GC-TRINICON method is compared to a recently proposed subspace method for RTF estimation relying on voice-activity control. Experimental results confirm the effectiveness of GC-TRINICON in realistic conditions.
Citations: 31
Acoustic scene classification using sparse feature learning and event-based pooling
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701893
Kyogu Lee, Ziwon Hyung, Juhan Nam
Abstract: Recently, unsupervised learning algorithms have been successfully used to represent data in many machine recognition tasks. In particular, sparse feature learning algorithms have shown that they can not only discover meaningful structures from raw data but also outperform many hand-engineered features. In this paper, we apply the sparse feature learning approach to acoustic scene classification. We use a sparse restricted Boltzmann machine to capture manyfold local acoustic structures from audio data and represent the data in a high-dimensional sparse feature space given the learned structures. For scene classification, we summarize the local features by pooling over audio scene data. While the feature pooling is typically performed over uniformly divided segments, we suggest a new pooling method, which first detects audio events and then performs pooling only over detected events, considering the irregular occurrence of audio events in acoustic scene data. We evaluate the learned features on the IEEE AASP Challenge development set, comparing them with a baseline model using mel-frequency cepstral coefficients (MFCCs). The results show that learned features outperform MFCCs, event-based pooling achieves higher accuracy than uniform pooling and, furthermore, a combination of the two methods performs even better than either one used alone.
Citations: 29
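The event-based pooling idea above is simple to sketch: rather than averaging frame-level features uniformly over the clip, detect "event" frames first and pool only over those. The sketch below uses a crude energy threshold as a stand-in for the paper's event detector; the feature matrix and threshold choice are illustrative assumptions.

```python
import numpy as np

def event_based_pooling(features, energies, threshold=None):
    """Sketch of event-based pooling: `features` is (n_frames, n_dims),
    `energies` is a per-frame energy. Frames whose energy exceeds a
    threshold count as events, and mean-pooling runs over event frames
    only, instead of over all frames uniformly."""
    if threshold is None:
        threshold = energies.mean()      # crude adaptive threshold (assumption)
    mask = energies > threshold
    if not mask.any():                   # no events detected: fall back
        return features.mean(axis=0)     # to uniform pooling
    return features[mask].mean(axis=0)
```

Pooling only over events keeps quiet background frames from diluting the summary vector, which is the intuition behind the accuracy gain the paper reports.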
Environment-aware ideal binary mask estimation using monaural cues
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701821
T. May, T. Dau
Abstract: We present a monaural approach to speech segregation that estimates the ideal binary mask (IBM) by combining amplitude modulation spectrogram (AMS) features, pitch-based features and speech presence probability (SPP) features derived from noise statistics. To maintain a high mask estimation accuracy in the presence of various background noises, the system employs environment-specific segregation models and automatically selects the appropriate model for a given input signal. Furthermore, instead of classifying each time-frequency (T-F) unit independently, the a posteriori probabilities of speech and noise presence are evaluated by considering adjacent T-F units. The proposed system achieves high classification accuracy.
Citations: 15
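For readers unfamiliar with the target of this estimation task: the ideal binary mask itself is defined with oracle access to the separate speech and noise signals, keeping a time-frequency unit when its local SNR exceeds a local criterion (LC). A minimal definition, assuming magnitude spectrograms as inputs:

```python
import numpy as np

def ideal_binary_mask(speech_spec, noise_spec, lc_db=0.0):
    """Oracle ideal binary mask (IBM): given the separate speech and
    noise magnitude spectrograms, keep a T-F unit (mask = 1) when its
    local SNR exceeds the local criterion lc_db, else discard it."""
    snr_db = 10 * np.log10((np.abs(speech_spec) ** 2 + 1e-12) /
                           (np.abs(noise_spec) ** 2 + 1e-12))
    return (snr_db > lc_db).astype(float)
```

The system in the paper estimates this mask blindly from the noisy mixture; the oracle version is what its output is scored against.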
Learning an intelligibility map of individual utterances
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701835
Michael I. Mandel
Abstract: Predicting the intelligibility of noisy recordings is difficult and most current algorithms only aim to be correct on average across many recordings. This paper describes a listening test paradigm and associated analysis technique that can predict the intelligibility of a specific recording of a word in the presence of a specific noise instance. The analysis learns a map of the importance of each point in the recording's spectrogram to the overall intelligibility of the word when glimpsed through "bubbles" in many noise instances. By treating this as a classification problem, a linear classifier can be used to predict intelligibility and can be examined to determine the importance of spectral regions. This approach was tested on recordings of vowels and consonants. The important regions identified by the model in these tests agreed with those identified by a standard, non-predictive statistical test of independence and with the acoustic phonetics literature.
Citations: 6
Speech understanding in noise provided by a simulated cochlear implant processor based on matching pursuit
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701878
A. Kressner, C. Rozell
Abstract: Speech reception is poor for cochlear implant recipients in listening environments with interfering noise. This study investigates the speech understanding provided in interfering noise by a coding strategy based on the sparse approximation algorithm matching pursuit (MP) and additionally proposes two modifications to the strategy. The levels of spectral information provided by the MP strategy and the modified MP strategy are compared to that of continuous interleaved sampling (CIS) and a strategy based on the ideal binary mask (IBM) using vocoded speech and the normalized covariance metric (NCM). We demonstrate objective intelligibility improvements in quiet, and total and partial objective intelligibility restoration in steady-state and fluctuating noise, respectively.
Citations: 1
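The sparse-approximation core named in this abstract, plain matching pursuit, is short enough to sketch: greedily pick the dictionary atom most correlated with the current residual, record its coefficient, subtract its contribution, and repeat. This is textbook MP, not the paper's cochlear-implant coding strategy; the dictionary here is an arbitrary set of unit-norm column atoms.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=10):
    """Plain matching pursuit: at each step select the unit-norm atom
    (column of `dictionary`) with the largest absolute correlation to
    the residual, accumulate its coefficient, and subtract its
    projection from the residual."""
    residual = signal.astype(float).copy()
    coefs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        corr = dictionary.T @ residual
        k = int(np.argmax(np.abs(corr)))
        coefs[k] += corr[k]
        residual -= corr[k] * dictionary[:, k]
    return coefs, residual
```

In the cochlear-implant setting, the few selected atoms per frame determine which electrodes are stimulated, which is what makes a sparse coder a natural fit for the small number of channels.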
Large-scale audio feature extraction and SVM for acoustic scene classification
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701857
Jürgen T. Geiger, Björn Schuller, G. Rigoll
Abstract: This work describes a system for acoustic scene classification using large-scale audio feature extraction. It is our contribution to the Scene Classification track of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (D-CASE). The system classifies 30 second long recordings of 10 different acoustic scenes. From the highly variable recordings, a large number of spectral, cepstral, energy and voicing-related audio features are extracted. Using a sliding window approach, classification is performed on short windows. SVMs are used to classify these short segments, and a majority voting scheme is employed to get a decision for longer recordings. On the official development set of the challenge, an accuracy of 73% is achieved. SVMs are compared with a nearest neighbour classifier and an approach called Latent Perceptual Indexing, whereby SVMs achieve the best results. A feature analysis using the t-statistic shows that mainly Mel spectra are the most relevant features.
Citations: 123
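The decision scheme in this abstract — per-window classification followed by majority voting over the recording — can be sketched generically. The per-window predictor is passed in as a callable (the paper uses SVMs; the toy classifier in the usage below is purely illustrative).

```python
import numpy as np

def classify_recording(window_features, window_classifier):
    """Sketch of sliding-window classification with majority voting:
    classify each short window independently with `window_classifier`,
    then label the whole recording with the most frequent window label."""
    labels = [window_classifier(w) for w in window_features]
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]
```

Voting over many short windows makes the recording-level decision robust to individual windows that land on atypical or silent stretches of the scene.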
Speech enhancement for hearing instruments: Enabling communication in adverse conditions
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date: 2013-10-01 DOI: 10.1109/WASPAA.2013.6701897
Rainer Martin
Abstract: Hearing instruments are frequently used in notoriously difficult acoustic scenarios. Even for normal-hearing people, ambient noise, reverberation and echoes often contribute to a degraded communication experience. The impact of these factors becomes significantly more prominent when participants suffer from a hearing loss. Nevertheless, hearing instruments are frequently used in these adverse conditions and must enable effortless communication. In this talk I will discuss challenges that are encountered in acoustic signal processing for hearing instruments. While many algorithms are motivated by the quest for a cocktail party processor and by the high-level paradigms of auditory scene analysis, a careful design of statistical models and processing schemes is necessary to achieve the required performance in real world applications. Rather strict requirements result from the size of the device, the power budget, and the admissible processing latency. Starting with low-latency spectral analysis and synthesis systems for speech and music signals, I will continue highlighting statistical estimation and smoothing techniques for the enhancement of noisy speech. The talk emphasizes the necessity to find a good balance between temporal and spectral resolution, processing latency, and statistical estimation errors. It concludes with single and multi-channel speech enhancement examples and an outlook towards opportunities which reside in the use of comprehensive speech processing models and distributed resources.
Citations: 0