{"title":"Robust speech recognition under noisy environments using asymmetric tapers","authors":"Md. Jahangir Alam, P. Kenny, D. O'Shaughnessy","doi":"10.5281/ZENODO.43036","DOIUrl":"https://doi.org/10.5281/ZENODO.43036","url":null,"abstract":"This paper presents asymmetric taper (or window)-based robust Mel frequency cepstral coefficient (MFCC) feature extraction for automatic speech recognition (ASR). Commonly, MFCC features are computed from a symmetric Hamming-tapered direct-spectrum estimate. Symmetric tapers have linear phase and also imply longer time delay. In ASR systems, phase information is usually discarded as human speech perception is relatively insensitive to short-time phase distortion. So, any linearity constraint on phase can be removed without adverse effects. Use of asymmetric tapers, having better frequency response and shorter time delay, for MFCC feature extraction in speech recognition can lead to better recognition performance. Using our proposed method it is possible to introduce asymmetry in any symmetric taper by adjusting only one additional parameter, which controls the degree of asymmetry. Experimental results on the AURORA-2 corpus show that the proposed asymmetric tapers outperform the symmetric Hamming taper in terms of word accuracy both in clean and noisy environments.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114837869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"P-norm minimization over intersections of convex sets","authors":"I. Bayram","doi":"10.5281/ZENODO.42790","DOIUrl":"https://doi.org/10.5281/ZENODO.42790","url":null,"abstract":"We consider the minimization of the ℓp norm subject to convex constraints. The problem considered in this paper may be regarded as a relaxation of a similar problem that employs the ℓ1 norm. We derive the dual problem, which is unconstrained and devise an algorithm for the dual problem by adapting the Douglas-Rachford algorithm. We demonstrate the utility of the algorithm on an experiment and discuss its differences with an existing algorithm.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117297579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimode spatiotemporal background modeling for complex scenes","authors":"Li Sun, Quentin De Neyer, C. Vleeschouwer","doi":"10.5281/ZENODO.43174","DOIUrl":"https://doi.org/10.5281/ZENODO.43174","url":null,"abstract":"We present a new approach for modeling background in complex scenes that contain motions caused, e.g., by wind over a water surface, in tree branches, or over grass. The background model of each pixel is defined based on the observation of its spatial neighborhood in a recent history, and includes up to K ≥ 1 modes, ranked in decreasing order of occurrence frequency. Foreground regions can then be detected by comparing the intensity of an observed pixel to the high frequency modes of its background model. Experiments show that our spatial-temporal background model is superior to traditional related algorithms in cases for which a pixel encounters modes that are frequent in the spatial neighborhood without being frequent enough in the actual pixel position. As an additional contribution, our paper also proposes an original assessment method, which has the advantage of avoiding the use of costly handmade ground truth sequences of foreground object silhouettes.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115846666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved estimation of probabilities in pronunciation by Analogy","authors":"J. Kujala, A. Nandi","doi":"10.5281/ZENODO.43156","DOIUrl":"https://doi.org/10.5281/ZENODO.43156","url":null,"abstract":"Pronunciation by Analogy is a method for generating phonetic transcriptions for previously unseen written words based on matching substrings of known words and their pronunciations. The method inherently generates several candidate pronunciations and a multitude of heuristics have been proposed for choosing the best one. In [1], a theoretically justified probabilistic approach for scoring the pronunciations was proposed, with performance on par with the best heuristic methods. However, a certain ad hoc modification - a fractional power applied to the estimated probabilities of the substring pronunciations - was also found to improve performance. In this article, we give an explanation for this unexpected improvement. We show that the fractional power in fact improves the estimates of the candidate pronunciation probabilities. This also gives an indirect explanation of the good performance of the current best heuristic proposed in [2].","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115910572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using statistical room acoustics for analysing the output SNR of the MWF in acoustic sensor networks","authors":"Toby Christian Lawin-Ore, S. Doclo","doi":"10.5281/ZENODO.43244","DOIUrl":"https://doi.org/10.5281/ZENODO.43244","url":null,"abstract":"In the context of acoustic sensor networks with spatially distributed microphones, the selection of the subset of microphones yielding the best performance is of great interest. Subset selection can be achieved by comparing the theoretical performance of different subsets of microphones. In this paper, we derive an analytical expression for the spatially averaged output SNR of the multi-channel Wiener filter (MWF) in a diffuse noise field, exploiting the statistical properties of the acoustic transfer functions (ATFs) between the desired source and the microphones. This analytical expression only requires the room properties and the source-microphone distances to be known. Simulation results show that the spatially averaged output SNR obtained using the statistical properties of ATFs is similar to the average output SNR obtained using simulated ATFs, therefore providing an efficient way to compare the performance of different subsets of microphones.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123359220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A convex inner approximation technique for rank-two beamforming in multicasting relay networks","authors":"A. Schad, K. L. Law, M. Pesavento","doi":"10.5281/ZENODO.52312","DOIUrl":"https://doi.org/10.5281/ZENODO.52312","url":null,"abstract":"In this paper, we propose a novel scheme for single-group multicasting using a relay network. We assume a source that transmits messages via an amplify-and-forward relay network to multiple destinations. The goal is to minimize the maximum transmitted power of the relays under constraints on the signal-to-noise ratios at the destinations. To increase the degrees of freedom in the system, the relays process two source signals jointly, using two different relay beamforming weight vectors. The Alamouti space-time block code is transmitted over two beams. Simulation results demonstrate the performance of the proposed scheme combined with a proposed sequential convex programming algorithm compared to methods from the literature and to the theoretical lower bound.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122535769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborative diffusive source localization in wireless sensor networks","authors":"Sabina Zejnilovic, J. Gomes, B. Sinopoli","doi":"10.5281/ZENODO.42977","DOIUrl":"https://doi.org/10.5281/ZENODO.42977","url":null,"abstract":"We propose a collaborative, energy efficient method for diffusive source localization in wireless sensor networks. The algorithm is based on distributed and iterative maximum-likelihood (ML) estimation, which is very sensitive to initialization. As a part of the proposed method we present an approach for obtaining a “good enough” initial value for the ML recursion based on infinite time approximation and semidefinite programming. We also present an approach for determining the sensor node that initiates the estimation process. To improve the convergence rate of the algorithm, we consider the case where selected nodes collaborate with their neighbors. Simulation results are used to characterize the performance and energy efficiency of the algorithm. We also illustrate estimation accuracy/energy consumption trade-off by varying the communication radius of sensor nodes.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126305462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient kernel adaptive filtering algorithm using hyperplane projection along affine subspace","authors":"M. Yukawa, R. Ishii","doi":"10.5281/ZENODO.52430","DOIUrl":"https://doi.org/10.5281/ZENODO.52430","url":null,"abstract":"We propose a novel kernel adaptive filtering algorithm that selectively updates a few coefficients at each iteration by projecting the current filter onto the zero instantaneous-error hyperplane along a certain time-dependent affine subspace. Coherence is exploited for selecting the coefficients to be updated as well as for measuring the novelty of new data. The proposed algorithm is a natural extension of the normalized kernel least mean squares algorithm operating iterative hyperplane projections in a reproducing kernel Hilbert space. The proposed algorithm enjoys low computational complexity. Numerical examples indicate high potential of the proposed algorithm.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129374942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new class of FLANN filters with application to nonlinear active noise control","authors":"A. Carini, G. Sicuranza","doi":"10.5281/ZENODO.52604","DOIUrl":"https://doi.org/10.5281/ZENODO.52604","url":null,"abstract":"FLANN and generalized FLANN filters exploiting trigonometric functions are often used in active noise control. However, they cannot approximate arbitrarily well every causal, time-invariant, finite-memory, nonlinear system, i.e., unlike the Volterra filters, they are not universal approximators. In this paper, we propose a novel class of FLANN filters, called Complete FLANN (CFLANN) filters, which satisfy the Stone-Weierstrass theorem, and thus can approximate arbitrarily well any nonlinear, time-invariant, finite-memory, continuous system. CFLANN filters are members of the class of nonlinear filters characterized by the property that their output depends linearly on the filter coefficients. As a consequence, they can be efficiently implemented in the form of a filter bank and adapted using algorithms simply derived from those applied to linear filters. In the paper, we apply a nonlinearly Filtered-X NLMS algorithm for CFLANN filters and describe some useful applications in the area of nonlinear active noise control.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129755020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the implementation of fully adaptive interpolated FIR filters","authors":"E. Batista, R. Seara","doi":"10.5281/ZENODO.43115","DOIUrl":"https://doi.org/10.5281/ZENODO.43115","url":null,"abstract":"This paper presents a novel strategy for implementing fully adaptive interpolated finite impulse response (FAIFIR) structures using either the least-mean-square (LMS) or the normalized LMS algorithms. The aim of such a strategy is to mitigate numerical stability issues arising from simultaneously adapting the two cascaded filters (sparse filter and interpolator) that compose a FAIFIR structure. In this context, a modification in the structure of the interpolator is proposed with no impact on either the computational complexity or the applicability of the FAIFIR structure. As a result, adaptive filters with enhanced numerical properties are obtained. Numerical simulation results are presented attesting to the effectiveness of the proposed strategy.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"137 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129407831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}