Siddharth Bhela, V. Kekatos, Liang Zhang, S. Veeramachaneni
{"title":"Enhancing observability in power distribution grids","authors":"Siddharth Bhela, V. Kekatos, Liang Zhang, S. Veeramachaneni","doi":"10.1109/ICASSP.2017.7953018","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953018","url":null,"abstract":"Power distribution grids are currently challenged by observability issues due to limited metering infrastructure. On the other hand, smart meter data, including local voltage magnitudes and power injections, are collected at grid nodes with renewable generation and demand-response programs. A power flow-based approach using these data is put forth here to infer the unknown power injections at non-metered grid nodes. Exploiting the control capabilities of smart inverters and the relative time-invariance of conventional loads, the idea is to solve the non-linear power flow equations jointly over two system realizations. An intuitive condition pertaining to the graph of the underlying grid is shown to be necessary and sufficient for the local identifiability of this task. The derived graph theoretic criterion can be checked efficiently and is numerically verified under realistic scenarios on the IEEE 13-bus feeder.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133160841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Q. Hoarau, A. Breloy, G. Ginolhac, A. Atto, J. Nicolas
{"title":"A subspace approach for shrinkage parameter selection in undersampled configuration for Regularised Tyler Estimators","authors":"Q. Hoarau, A. Breloy, G. Ginolhac, A. Atto, J. Nicolas","doi":"10.1109/ICASSP.2017.7952765","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952765","url":null,"abstract":"Regularized Tyler Estimators (RTEs) have attracted attention over the past years due to their attractive performance over a wide range of noise distributions and their natural robustness to outliers. Developing adaptive methods for the selection of the regularisation parameter α is currently an active topic of research. Indeed, the bias-performance compromise of RTEs highly depends on the considered application. Thus, finding a generic rule that is optimal for every criterion and/or data configuration is not straightforward. This issue is addressed in this paper for undersampled configurations (number of samples lower than the dimension of the data). The paper proposes a new regularisation parameter selection based on a subspace reduction approach. The performance of this method is investigated in terms of estimation accuracy and for adaptive detection purposes, both on simulation and real data.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124991677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pramod B. Bachhav, M. Todisco, M. M. Idrissa, C. Beaugeant, N. Evans
{"title":"Artificial bandwidth extension using the constant Q transform","authors":"Pramod B. Bachhav, M. Todisco, M. M. Idrissa, C. Beaugeant, N. Evans","doi":"10.1109/ICASSP.2017.7953218","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953218","url":null,"abstract":"Most artificial bandwidth extension (ABE) algorithms are based on the classical source-filter model of speech production. This approach generally requires the dual extension of each component through independent processing. Alternative approaches reported recently operate on the spectrum. With human perception thought to be largely insensitive to phase, most such approaches focus on the extension of the magnitude spectrum alone and rely on Fourier spectral analysis. This paper reports an approach to ABE based on the constant Q transform (CQT), a more perceptually motivated approach to spectral analysis. A Gaussian mixture model is used to estimate missing highband components from available narrowband components before resynthesis with phase estimates obtained from the upsampled narrowband signal. Objective assessment shows that energy normalisation is critical to performance. These findings and the appeal of CQT for ABE are confirmed through informal subjective tests based on the mean opinion score.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129254768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Salience based lexical features for emotion recognition","authors":"Kalani Wataraka Gamage, V. Sethu, E. Ambikairajah","doi":"10.1109/ICASSP.2017.7953274","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953274","url":null,"abstract":"In this paper we focus on the usefulness of verbal events for speech based emotion recognition. In particular, the use of phoneme sequences to encode verbal cues related to the expression of emotions is proposed and lexical features based on these phoneme sequences are introduced for use in automatic emotion recognition systems where manual transcripts are not available. Secondly, a novel estimate of emotional salience of verbal cues, applicable to both phoneme sequences and words, is presented. Experimental results on the IEMOCAP database show that the proposed automatic phoneme sequence based features can achieve an Unweighted Average Recall (UAR) of 49% with the proposed salience measure. Further, the proposed salience measure can lead to a UAR of 64% when using manual word transcriptions. Both of these are the highest UARs reported on the IEMOCAP database for systems using lexical features extracted from automatic and manual transcripts respectively.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131189949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hieu-Thi Luong, Shinji Takaki, G. Henter, J. Yamagishi
{"title":"Adapting and controlling DNN-based speech synthesis using input codes","authors":"Hieu-Thi Luong, Shinji Takaki, G. Henter, J. Yamagishi","doi":"10.1109/ICASSP.2017.7953089","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953089","url":null,"abstract":"Methods for adapting and controlling the characteristics of output speech are important topics in speech synthesis. In this work, we investigated the performance of DNN-based text-to-speech systems that in parallel to conventional text input also take speaker, gender, and age codes as inputs, in order to 1) perform multi-speaker synthesis, 2) perform speaker adaptation using small amounts of target-speaker adaptation data, and 3) modify synthetic speech characteristics based on the input codes. Using a large-scale, studio-quality speech corpus with 135 speakers of both genders and ages between tens and eighties, we performed three experiments: 1) First, we used a subset of speakers to construct a DNN-based, multi-speaker acoustic model with speaker codes. 2) Next, we performed speaker adaptation by estimating code vectors for new speakers via backpropagation from a small amount of adaptation material. 3) Finally, we experimented with manually manipulating input code vectors to alter the gender and/or age characteristics of the synthesised speech. Experimental results show that high-performance multi-speaker models can be constructed using the proposed code vectors with a variety of encoding schemes, and that adaptation and manipulation can be performed effectively using the codes.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121883567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Anantrasirichai, M. Allinovi, W. Hayes, D. Bull, A. Achim
{"title":"Line detection in speckle images using Radon transform and ℓ1 regularization","authors":"N. Anantrasirichai, M. Allinovi, W. Hayes, D. Bull, A. Achim","doi":"10.1109/ICASSP.2017.7953356","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953356","url":null,"abstract":"Boundaries and lines in medical images are important structures as they can delineate between tissue types, organs, and membranes. Although a number of image enhancement and segmentation methods have been proposed to detect lines, none of these have considered line artefacts, which are more difficult to visualise as they are not physical structures, yet are still meaningful for clinical interpretation. This paper presents a novel method to restore lines, including line artefacts, in speckle images. We address this as a sparse estimation problem using a convex optimisation technique based on a Radon transform and sparsity regularisation (ℓ1 norm). This problem divides into subproblems which are solved using the alternating direction method of multipliers, thereby achieving line detection and deconvolution simultaneously. The results for both simulated and in vivo ultrasound images show that the proposed method outperforms existing methods, in particular for detecting B-lines in lung ultrasound images, where the performance can be improved by up to 30%.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132219676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
X. Chen, A. Ragni, J. Vasilakes, X. Liu, K. Knill, M. Gales
{"title":"Recurrent neural network language models for keyword search","authors":"X. Chen, A. Ragni, J. Vasilakes, X. Liu, K. Knill, M. Gales","doi":"10.1109/ICASSP.2017.7953263","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7953263","url":null,"abstract":"Recurrent neural network language models (RNNLMs) have become increasingly popular in many applications such as automatic speech recognition (ASR). Significant performance improvements in both perplexity and word error rate over standard n-gram LMs have been widely reported on ASR tasks. In contrast, published research on using RNNLMs for keyword search systems has been relatively limited. In this paper the application of RNNLMs for the IARPA Babel keyword search task is investigated. In order to supplement the limited acoustic transcription data, large amounts of web texts are also used in large vocabulary design and LM training. Various training criteria were then explored to improve RNNLMs' efficiency in both training and evaluation. Significant and consistent improvements on both keyword search and ASR tasks were obtained across all languages.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130870776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Least 1-norm pole-zero modeling with sparse deconvolution for speech analysis","authors":"Liming Shi, J. Jensen, M. G. Christensen","doi":"10.1109/ICASSP.2017.7952252","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952252","url":null,"abstract":"In this paper, we present a speech analysis method based on sparse pole-zero modeling of speech. Instead of using the all-pole model to approximate the speech production filter, a pole-zero model is used for the combined effect of the vocal tract, radiation at the lips and the glottal pulse shape. Moreover, to consider the spiky excitation form of the pulse train during voiced speech, the modeling parameters and sparse residuals are estimated in an iterative fashion using a least 1-norm pole-zero with sparse deconvolution algorithm. Compared with the conventional two-stage least squares pole-zero, linear prediction and sparse linear prediction methods, experimental results show that the proposed speech analysis method has lower spectral distortion, higher reconstruction SNR and sparser residuals.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124159407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balanced sensor management across multiple time instances via l-1/l-infinity norm minimization","authors":"Cristian Rusu, J. Thompson, N. Robertson","doi":"10.1109/ICASSP.2017.7952769","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952769","url":null,"abstract":"In this paper, we propose a solution to the sensor management problem over multiple time instances that balances the accuracy of the sensor network estimation with its utilization. We show how this problem reduces to a binary optimization problem for which we give a convex relaxation based solution that involves the minimization of a regularized ℓ∞ reweighted ℓ1 norm. We show experimentally the behavior of the proposed algorithm and compare it with previous methods from the literature.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130085035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PPG-based heart rate estimation using Wiener filter, phase vocoder and Viterbi decoding","authors":"A. Temko","doi":"10.1109/ICASSP.2017.7952309","DOIUrl":"https://doi.org/10.1109/ICASSP.2017.7952309","url":null,"abstract":"Accurate heart rate (HR) estimation from the photoplethysmographic (PPG) signal during intensive physical exercises is tackled in this paper. Wiener filters are designed to attenuate the influence of motion artifacts. The phase vocoder is used to improve the initial Discrete Fourier transform (DFT) based frequency estimation. Additionally, Viterbi decoding is used as a novel post-processing step to find the path through time-frequency state-space plane. The system performance is assessed on a publicly available dataset of 23 PPG recordings. The resulting algorithm is designed for scenarios that do not require online HR monitoring (swimming, offline fitness statistics). The resultant system with an error rate of 1.31 beats per minute outperforms all other systems reported to date in the literature and in contrast to existing alternatives requires no parameter to tune at the post-processing stage and operates at a much lower computational cost. The Matlab implementation is provided online.","PeriodicalId":118243,"journal":{"name":"2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128156543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}