{"title":"Smooth time-frequency estimation using covariance fitting","authors":"Johan Brynolfsson, Johan Sward, A. Jakobsson, M. Hansson","doi":"10.1109/icassp.2014.6853702","DOIUrl":"https://doi.org/10.1109/icassp.2014.6853702","url":null,"abstract":"In this paper, we introduce a time-frequency spectral estimator for smooth spectra, allowing for irregularly sampled measurements. A non-parametric representation of the time dependent (TD) covariance matrix is formed by assuming that the spectrum is piecewise linear. Using this representation, the time-frequency spectrum is then estimated by solving a convex covariance fitting problem, which also, as a byproduct, provides an enhanced estimation of the TD covariance matrix. Numerical examples using simulated non-stationary processes show the preferable performance of the proposed method as compared to the classical Wigner-Ville distribution and a smoothed spectrogram.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"28 1","pages":"779-783"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81078003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum likelihood SNR estimation over time-varying flat-fading SIMO channels","authors":"F. Bellili, Rabii Meftehi, S. Affes, A. Stephenne","doi":"10.1109/ICASSP.2014.6854861","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854861","url":null,"abstract":"In this paper, we propose a new signal-to-noise-ratio (SNR) maximum likelihood (ML) estimator over time-varying single-input multiple-output (SIMO) channels, for both data-aided (DA) and non-data-aided (NDA) cases. Unlike the classical techniques which assume the channel to be slowly time-varying and, therefore, considered as constant during the observation period, we address the more challenging problem of instantaneous SNR estimation over fast time-varying channels. The channel variations are locally tracked using a polynomial-in-time expansion. In the DA scenario, the ML estimator is developed in closed-form expression. In the NDA scenario, however, the ML estimates of the per-antenna SNRs are obtained iteratively, with very few iterations, using the expectation-maximization (EM) procedure. Our estimator is able to accurately estimate the instantaneous SNRs over a wide range of average SNR. We show through extensive Monte-Carlo simulations that the new estimator outperforms previously developed solutions.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"35 1","pages":"6523-6527"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81236712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Signal processing challenges for radio astronomical arrays","authors":"S. Wijnholds, A. V. D. Veen, F. D. Stefani, E. L. Rosa, A. Farina","doi":"10.1109/ICASSP.2014.6854631","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854631","url":null,"abstract":"Current and future radio telescopes, in particular the Square Kilometre Array (SKA), are envisaged to produce large images (> 10^8 pixels) with over 60 dB dynamic range. This poses a number of image reconstruction and technological challenges, which will require novel approaches to image reconstruction and design of data processing systems. In this paper, we sketch the limitations of current algorithms by extrapolating their computational requirements to future radio telescopes as well as by discussing their imaging limitations. We discuss a number of potential research directions to cope with these challenges.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"35 1","pages":"5382-5386"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82036397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple-average-voice-based speech synthesis","authors":"P. Lanchantin, M. Gales, Simon King, J. Yamagishi","doi":"10.1109/ICASSP.2014.6853603","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6853603","url":null,"abstract":"This paper describes a novel approach for the speaker adaptation of statistical parametric speech synthesis systems based on the interpolation of a set of average voice models (AVM). Recent results have shown that the quality/naturalness of adapted voices depends on the distance from the average voice model used for speaker adaptation. This suggests the use of several AVMs trained on carefully chosen speaker clusters from which a more suitable AVM can be selected/interpolated during the adaptation. In the proposed approach a set of AVMs, a multiple-AVM, is trained on distinct clusters of speakers which are iteratively re-assigned during the estimation process initialised according to metadata. During adaptation, each AVM from the multiple-AVM is first adapted towards the target speaker. The adapted means from the AVMs are then interpolated to yield the final speaker adapted mean for synthesis. It is shown, performing speaker adaptation on a corpus of British speakers with various regional accents, that the quality/naturalness of synthetic speech of adapted voices is significantly higher than when considering a single factor-independent AVM selected according to the target speaker characteristics.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"17 1","pages":"285-289"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78561822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Waveform selection for range and Doppler estimation via Barankin bound signal-to-noise ratio threshold","authors":"John S. Kota, N. Kovvali, D. Bliss, A. Papandreou-Suppappola","doi":"10.1109/ICASSP.2014.6854485","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854485","url":null,"abstract":"In this paper, we consider the tracking of a radar target with unknown range and range rate at low signal-to-noise ratio (SNR). For this nonlinear estimation problem, the Cramér-Rao lower bound (CRLB) provides a bound on an unbiased estimator's mean-squared error (MSE). However, there exists a threshold SNR at which the estimator variance deviates from the CRLB. We consider the Barankin bound (BB) on the range and range-rate variance in order to obtain a tighter lower bound at low SNR, and we use the BB to predict the SNR threshold for a transmitted signal. We demonstrate that the BB with the additional information provided by the threshold SNR has an advantage over the CRLB in selecting the optimal transmit waveform at low SNRs. We also develop a waveform parameter configuration method that uses the BB and the ambiguity function resolution cell measurement model to optimize the SNR threshold.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"12 1","pages":"4658-4662"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79520584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functional relevant multichannel kernel adaptive filter for human activity analysis","authors":"A. Álvarez-Meza, G. Castellanos-Domínguez, J. Príncipe","doi":"10.1109/ICASSP.2014.6854427","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854427","url":null,"abstract":"A multichannel kernel adaptive filtering framework is presented that highlights relevant channels for the task of analyzing Motion Capture (MoCap) data. Functional relevance analysis is performed over input multichannel data by computing the pair-wise channel similarities to describe the main behavior of the considered applications. Particularly, the well-known Kernel Least Mean Square filter is enhanced using a correntropy-based similarity criterion between channel pairs. Besides, two sparseness criteria are studied to extract a sample subset that constructs a learning model displaying a good trade-off between filter complexity and accuracy. The proposed approach allows devising complex relationships among multi-channel time-series, revealing dependencies among the channels and the process time-structure. The method is tested on a well-known MoCap data set. Results show that our framework is an adequate alternative for finding functional relevance amongst multi-channel time-series.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"23 1","pages":"4369-4373"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85696438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High accuracy discrimination of Parkinson's disease participants from healthy controls using smartphones","authors":"S. Arora, V. Venkataraman, S. Donohue, K. Biglan, E. Dorsey, Max A. Little","doi":"10.1109/ICASSP.2014.6854280","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854280","url":null,"abstract":"The aim of this study is to accurately distinguish Parkinson's disease (PD) participants from healthy controls using self-administered tests of gait and postural sway. Using consumer-grade smartphones with in-built accelerometers, we objectively measure and quantify key movement severity symptoms of Parkinson's disease. Specifically, we record tri-axial accelerations, and extract a range of different features based on the time and frequency-domain properties of the acceleration time series. The features quantify key characteristics of the acceleration time series, and enhance the underlying differences in the gait and postural sway accelerations between PD participants and controls. Using a random forest classifier, we demonstrate an average sensitivity of 98.5% and average specificity of 97.5% in discriminating PD participants from controls.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"63 1","pages":"3641-3644"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86026328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of sign-language content in video through polar motion profiles","authors":"Virendra Karappa, C. D. D. Monteiro, F. Shipman, R. Gutierrez-Osuna","doi":"10.1109/ICASSP.2014.6853805","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6853805","url":null,"abstract":"Locating sign language (SL) videos on video sharing sites (e.g., YouTube) is challenging because search engines generally do not use the visual content of videos for indexing. Instead, indexing is done solely based on textual content (e.g., title, description, metadata). As a result, untagged SL videos do not appear in the search results. In this paper, we present and evaluate a classification approach to detect SL videos based on their visual content. The approach uses an ensemble of Haar-based face detectors to define regions of interest (ROI), and a background model to segment movements in the ROI. The two-dimensional (2D) distribution of foreground pixels in the ROI is then reduced to two 1D polar motion profiles by means of a polar-coordinate transformation, and then classified by means of an SVM. When evaluated on a dataset of user-contributed YouTube videos, the approach achieves 81% precision and 94% recall.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"78 1","pages":"1290-1294"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84061384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-uniform sampling and Gaussian process regression in transport of intensity phase imaging","authors":"Jingshan Zhong, Rene A. Claus, J. Dauwels, L. Tian, L. Waller","doi":"10.1109/ICASSP.2014.6855115","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6855115","url":null,"abstract":"Gaussian process (GP) regression is a nonparametric regression method that can be used to predict continuous quantities. Here, we show that the same technique can be applied to a class of phase imaging techniques based on measurements of intensity at multiple propagation distances, i.e. the transport of intensity equation (TIE). In this paper, we demonstrate how to apply GP regression to estimate the first intensity derivative along the direction of propagation and incorporate non-uniform propagation distance sampling. The low-frequency artifacts that often occur in phase recovery using traditional methods can be significantly suppressed by the proposed GP TIE method. The method is shown to be stable with moderate amounts of Gaussian noise. We validate the method experimentally by recovering the phase of human cheek cells in a bright field microscope and show better performance as compared to other TIE reconstruction methods.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"105 1","pages":"7784-7788"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78360496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stability and MSE analyses of affine projection algorithms for sparse system identification","authors":"Markus V. S. Lima, I. Sobrón, W. Martins, P. Diniz","doi":"10.1109/ICASSP.2014.6854836","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854836","url":null,"abstract":"We analyze two algorithms, viz. the affine projection algorithm for sparse system identification (APA-SSI) and the quasi APA-SSI (QAPA-SSI), regarding their stability and steady-state mean-squared error (MSE). These algorithms exploit the sparsity of the involved signals through an approximation of the l0 norm. Such an approach yields faster convergence and reduced steady-state MSE, as compared to algorithms that do not take the sparse nature of the signals into account. In addition, modeling sparsity via such an approximation has been consistently verified to be superior to the widely used l1 norm in several scenarios. In this paper, we show how to properly set the parameters of the two aforementioned algorithms in order to guarantee convergence, and we derive closed-form theoretical expressions for their steady-state MSE. A key conclusion from the proposed analysis is that the MSE of these two algorithms is a monotonically decreasing function of the sparsity degree. Simulation results are used to validate the theoretical findings.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"20 1","pages":"6399-6403"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78523031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}