2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics: Latest Papers

Using articulation index band correlations to objectively estimate speech intelligibility consistent with the modified rhyme test
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-23 DOI: 10.1109/WASPAA.2013.6701826
S. Voran
Abstract: We present an objective estimator of speech intelligibility that follows the paradigm of the Modified Rhyme Test (MRT). For each input, the estimator uses temporal correlations within articulation index bands to select one of six possible words from a list. The rate of successful word identification becomes the measure of speech intelligibility, as in the MRT. The estimator is called Articulation Band Correlation MRT (ABC-MRT). It consumes a tiny fraction of the resources required by MRT testing. ABC-MRT has been tested on a wide range of impaired speech recordings unseen during development. The resulting Pearson correlations between ABC-MRT and MRT results range from .95 to .99. These values exceed those of the other estimators tested.
Citations: 10
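The selection step described in the ABC-MRT abstract can be illustrated with a small sketch. This is not the published estimator: the per-band envelopes, the six-word candidate list, and the choice to average per-band Pearson correlations are all assumptions made here for illustration, and the names `pearson`, `select_word`, and `success_rate` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_word(input_bands, candidates):
    """Pick the candidate word whose band envelopes correlate best with the input.

    input_bands: list of temporal envelopes, one per band (hypothetical features).
    candidates:  dict mapping each of the six words to its list of band envelopes.
    Averaging the per-band correlations is an assumption of this sketch.
    """
    def score(template):
        per_band = [pearson(i, t) for i, t in zip(input_bands, template)]
        return sum(per_band) / len(per_band)

    return max(candidates, key=lambda word: score(candidates[word]))


def success_rate(trials):
    """Fraction of trials (input_bands, candidates, true_word) identified correctly."""
    hits = sum(select_word(inp, cands) == truth for inp, cands, truth in trials)
    return hits / len(trials)
```

As in the MRT, the value returned by `success_rate` would then serve as the intelligibility score for the condition under test.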
Roomprints for forensic audio applications
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-20 DOI: 10.1109/WASPAA.2013.6701854
Alastair H. Moore, M. Brookes, P. Naylor
Abstract: A roomprint is a quantifiable description of an acoustic environment which can be measured under controlled conditions and estimated from a monophonic recording made in that space. We here identify the properties required of a roomprint in forensic audio applications and review the observable characteristics of a room that, when extracted from recordings, could form the basis of a roomprint. Frequency-dependent reverberation time is investigated as a promising characteristic and used in a room identification experiment giving correct identification in 96% of trials.
Citations: 12
The geometry of sound-source localization using non-coplanar microphone arrays
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-20 DOI: 10.1109/WASPAA.2013.6701896
Xavier Alameda-Pineda, R. Horaud, B. Mourrain
Abstract: This paper addresses the task of sound-source localization from time delay estimates using arbitrarily shaped non-coplanar microphone arrays. We fully exploit the direct path propagation model and our contribution is threefold: we provide a necessary and sufficient condition for a set of time delays to correspond to a sound source position, a proof of the uniqueness of this position, and a localization mapping to retrieve it. The time delay estimation task is cast into a non-linear multivariate optimization problem constrained by necessary and sufficient conditions on time delays. Two global optimization techniques to estimate time delays and localize the sound source are investigated. We report an extensive set of experiments and comparisons with state-of-the-art methods on simulated and real data in the presence of noise and reverberations.
Citations: 2
Sound acquisition in noisy and reverberant environments using virtual microphones
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701869
K. Kowalczyk, O. Thiergart, A. Craciun, Emanuël Habets
Abstract: In hands-free communication applications, the main goal is to capture desired sounds, while reducing noise and interfering sounds. However, for natural-sounding telepresence systems, the spatial sound image should also be preserved. Using a recently proposed method for generating the signal of a virtual microphone (VM), one can recreate the sound image from an arbitrary point of view in the sound scene (e.g., close to a desired speaker), while being able to place the physical microphones outside the sound scene. In this paper, we present a method for synthesizing a VM signal in noisy and reverberant environments, where the estimation of the required direct and diffuse sound components is performed using two multichannel linear filters. The direct sound component is estimated using a multichannel Wiener filter, while the diffuse sound component is estimated using a linearly constrained minimum variance filter followed by a single-channel Wiener filter. Simulations in a noisy and reverberant environment show the applicability of the proposed method for sound acquisition in a scenario in which two microphone arrays are installed in a large TV.
Citations: 7
Broadband sensor location selection using convex optimization in very large scale arrays
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701889
Y. Lai, R. Balan, Heiko Claussen, J. Rosca
Abstract: Consider a sensing system using a large number of N microphones placed in multiple dimensions to monitor a broadband acoustic field. Using all the microphones at once is impractical because of the amount of data generated. Instead, we choose a subset of D microphones to be active. Specifically, we wish to find the set of D microphones that minimizes the largest interference gain at multiple frequencies while monitoring a target of interest. A direct, combinatorial approach - testing all N choose D subsets of microphones - is impractical because of the problem size. Instead, we use a convex optimization technique that induces sparsity through an ℓ1-penalty to determine which subset of microphones to use. We test the robustness of our solution through simulated annealing and compare its performance against a classical beamformer which maximizes SNR. Since switching from a subset of D microphones to another subset of D microphones at every sample is possible, we construct a space-time-frequency sampling scheme that achieves near optimal performance.
Citations: 3
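To get a feel for why the sensor-selection abstract calls the exhaustive search over all "N choose D" subsets impractical, a short calculation helps. The abstract gives no concrete values for N or D, so the numbers used with this sketch (for example N = 1000, D = 32) and the assumed evaluation rate are purely hypothetical; the function names are illustrative as well.

```python
import math


def subset_count(n_mics, n_active):
    """Number of distinct subsets of n_active microphones out of n_mics."""
    return math.comb(n_mics, n_active)


def exhaustive_search_years(n_mics, n_active, evals_per_second):
    """Years needed to evaluate every subset at an assumed, fixed evaluation rate."""
    seconds = subset_count(n_mics, n_active) / evals_per_second
    return seconds / (365.25 * 24 * 3600)
```

Calling `exhaustive_search_years(1000, 32, 10**9)` with these made-up values yields a figure far beyond any practical time budget, which is the motivation for replacing the combinatorial search with a sparsity-inducing convex relaxation.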
A fast Griffin-Lim algorithm
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701851
Nathanael Perraudin, P. Balázs, P. Søndergaard
Abstract: In this paper, we present a new algorithm to estimate a signal from its short-time Fourier transform modulus (STFTM). This algorithm is computationally simple and is obtained by an acceleration of the well-known Griffin-Lim algorithm (GLA). Before deriving the algorithm, we give a new interpretation of the GLA and formulate the phase recovery problem in an optimization form. We then present experimental results in which the new algorithm is tested on various signals. It not only converges significantly faster but also recovers the signals with a smaller error than the traditional GLA.
Citations: 128
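The idea of accelerating Griffin-Lim can be sketched as alternating projections with an added extrapolation (momentum) step. The abstract does not spell out the update rule, so the form used below, `t = P_C1(P_C2(c))` followed by `c = t + alpha * (t - t_prev)`, is an assumption of this sketch, as are the frame length, hop size, Hamming window, and the default `alpha = 0.99`. Setting `alpha = 0` reduces it to the plain Griffin-Lim iteration. The code is pure Python and deliberately tiny; it is meant to show the structure, not to be fast.

```python
import cmath
import math
import random

N, HOP = 16, 4  # frame length and hop size (illustrative values)
# Hamming window: nonzero everywhere, so the inverse below never divides by zero.
WIN = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def stft(x):
    """Windowed DFT of each frame; len(x) is assumed to equal HOP * k + N."""
    frames = []
    for start in range(0, len(x) - N + 1, HOP):
        seg = [x[start + n] * WIN[n] for n in range(N)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)
        ])
    return frames


def istft(frames, length):
    """Least-squares inverse STFT (windowed overlap-add, real-valued output)."""
    num, den = [0.0] * length, [0.0] * length
    for m, frame in enumerate(frames):
        for n in range(N):
            sample = sum(frame[k] * cmath.exp(2j * math.pi * k * n / N)
                         for k in range(N)).real / N
            num[m * HOP + n] += sample * WIN[n]
            den[m * HOP + n] += WIN[n] ** 2
    return [a / b for a, b in zip(num, den)]


def project_consistent(c, length):
    """P_C1: nearest coefficients that are the STFT of some real signal."""
    return stft(istft(c, length))


def project_magnitude(c, mag):
    """P_C2: keep the current phases, impose the target magnitudes."""
    return [[m * cmath.exp(1j * cmath.phase(v)) for v, m in zip(fr, mfr)]
            for fr, mfr in zip(c, mag)]


def griffin_lim(mag, length, iters=30, alpha=0.99, seed=0):
    """Estimate a signal from STFT magnitudes; alpha=0 gives plain Griffin-Lim."""
    rng = random.Random(seed)
    # Start from the target magnitudes with random phases.
    c = [[m * cmath.exp(2j * math.pi * rng.random()) for m in fr] for fr in mag]
    t_prev = project_consistent(c, length)
    for _ in range(iters):
        t = project_consistent(project_magnitude(c, mag), length)
        # Extrapolation step (assumed form of the acceleration).
        c = [[tv + alpha * (tv - pv) for tv, pv in zip(tf, pf)]
             for tf, pf in zip(t, t_prev)]
        t_prev = t
    return istft(t_prev, length)


def spectral_error(x, mag):
    """Relative distance between |STFT(x)| and the target magnitudes."""
    est = stft(x)
    num = sum((abs(v) - m) ** 2 for fr, mfr in zip(est, mag) for v, m in zip(fr, mfr))
    den = sum(m ** 2 for mfr in mag for m in mfr)
    return math.sqrt(num / den)
```

A typical use is to compute `mag` from a known test signal, run `griffin_lim` with `alpha=0.0` and with a nonzero `alpha` for the same number of iterations, and compare the two `spectral_error` values.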
Hierarchical modeling using automated sub-clustering for sound event recognition
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701862
M. Niessen, T. V. Kasteren, A. Merentitis
Abstract: The automatic recognition of sound events allows for novel applications in areas such as security, mobile and multimedia. In this work we present a hierarchical hidden Markov model for sound event detection that automatically clusters the inherent structure of the events into sub-events. We evaluate our approach on an IEEE audio challenge dataset consisting of office sound events and provide a systematic comparison of the various building blocks of our approach to demonstrate the effectiveness of incorporating certain dependencies in the model. The hierarchical hidden Markov model achieves an average frame-based F-measure recognition performance of 45.5% on a test dataset that was used to evaluate challenge submissions. We also show how the hierarchical model can be used as a meta-classifier, although in the particular application this did not lead to an increase in performance on the test dataset.
Citations: 24
Estimation of room dimensions from a single impulse response
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701867
Dejan Markovic, F. Antonacci, A. Sarti, S. Tubaro
Abstract: In this paper we propose a methodology for the estimation of the geometry of an environment based on a single Acoustic Impulse Response (AIR). The estimation algorithm makes use of tools for the modeling of propagation based on geometrical acoustics. A suitable cost function evaluates the distance between the simulated and measured AIRs. The room minimizing the cost function is chosen as the correct one. The cost function is strongly nonlinear. As a consequence, in order to reduce the complexity of the minimization problem, the algorithm needs a hypothesis about the class of geometry of the environment under analysis, such as rectangular or L-shaped rooms. We demonstrate the effectiveness of the proposed algorithm with a number of simulations of increasing complexity.
Citations: 14
Sparse representation and epoch estimation of voiced speech
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701885
J. Gunther, T. Moon
Abstract: Whereas most approaches to linear speech prediction fail to account for the quasi-periodic glottal flow, this paper incorporates a model for the glottal flow derivative (GFD) directly into the linear prediction problem. A linear model for the prediction error is obtained by constructing a dictionary of time-shifted GFD pulses. The pulses are constructed by applying glottal inverse filtering (GIF) to recorded speech. Minimizing the difference between the linear prediction residual and a sparse combination of the pulses in the dictionary leads to joint estimation of the linear predictor as well as a sparse representation for the prediction error that reveals the instants of vocal tract excitation (epochs). The method is applied to voiced segments extracted from the CMU Arctic dataset which also includes electro-glottograms. Results show that the proposed method is effective in estimating the parameters of interest and that GIF-based pulses more accurately model GFD pulses occurring in real speech than pulses computed using the mathematical models.
Citations: 0
Multichannel HR-NMF for modelling convolutive mixtures of non-stationary signals in the time-frequency domain
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics Pub Date : 2013-10-01 DOI: 10.1109/WASPAA.2013.6701824
R. Badeau, Mark D. Plumbley
Abstract: Several probabilistic models involving latent components have been proposed for modelling time-frequency (TF) representations of audio signals (such as spectrograms), notably in the nonnegative matrix factorization (NMF) literature. Among them, the recent high resolution NMF (HR-NMF) model is able to take both phases and local correlations in each frequency band into account, and its potential has been illustrated in applications such as source separation and audio inpainting. In this paper, HR-NMF is extended to multichannel signals and to convolutive mixtures. A fast variational expectation-maximization (EM) algorithm is proposed to estimate the enhanced model. This algorithm is applied to a stereophonic piano signal, and proves capable of accurately modelling reverberation and restoring missing observations.
Citations: 7