{"title":"A pattern recognition approach based on electrodermal response for pathological mood identification in bipolar disorders","authors":"A. Lanatà, A. Greco, G. Valenza, E. Scilingo","doi":"10.1109/ICASSP.2014.6854272","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854272","url":null,"abstract":"This paper reports on results of a pattern recognition technique for classifying pathological mental states of bipolar disorders using information gathered from the electrodermal response. The rationale behind this work is that the autonomic nervous system dynamics, non-invasively quantified through the electrodermal response processing, is altered by the specific mood state. Starting from the hypothesis that bipolar disorders are associated with affective dysfunctions, we processed data gathered from four bipolar patients through eleven experimental trials while an ad-hoc emotional stimulation is administered. Intra- and inter-subject variability were investigated. We show that, using a deconvolution-based approach to estimate sympathetic ANS markers and simple k-Nearest Neighbor algorithms, the proposed methodology is able to discern up to three mood states such as depression, hypo-mania, and euthymia with an average intra-subject accuracy greater than 98% and inter-subject accuracy greater than 82%.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"24 1","pages":"3601-3605"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73565082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech and audio loudness depending on telephone audio bandwidth and codec — A subjective testing approach","authors":"I. Edjekouane, C. Plapous, C. Quinquis, S. Meunier","doi":"10.1109/ICASSP.2014.6853812","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6853812","url":null,"abstract":"In this paper, we propose a new approach for the subjective assessment of the loudness of complex audio signals such as speech or music. This two-stage approach makes it possible to study the influence on loudness of the frequency bandwidth and of different kinds of codecs. In the first stage, the individual loudness function of each subject is estimated using a specific 100-point response scale. In the second stage, the subject evaluates the loudness of each processed sample, by filtering or coding/decoding, using the same scale. The loudness obtained in terms of points is then converted in loudness levels in terms of phons using the estimated individual loudness function. Results show that loudness increases with the bandwidth extension up to super-wideband. Similar behavior is observed when codecs are applied.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"18 1","pages":"1325-1329"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73662599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel cepstral representation for timbre modeling of sound sources in polyphonic mixtures","authors":"Z. Duan, Bryan Pardo, L. Daudet","doi":"10.1109/ICASSP.2014.6855057","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6855057","url":null,"abstract":"We propose a novel cepstral representation called the uniform discrete cepstrum (UDC) to represent the timbre of sound sources in a sound mixture. Different from ordinary cepstrum and MFCC which have to be calculated from the full magnitude spectrum of a source after source separation, UDC can be calculated directly from isolated spectral points that are likely to belong to the source in the mixture spectrum (e.g., non-overlapping harmonics of a harmonic source). Existing cepstral representations that have this property are discrete cepstrum and regularized discrete cepstrum, however, compared to the proposed UDC, they are not as effective and are more complex to compute. The key advantage of UDC is that it uses a more natural and locally adaptive regularizer to prevent it from overfitting the isolated spectral points. We derive the mathematical relations between these cepstral representations, and compare their timbre modeling performances in the task of instrument recognition in polyphonic audio mixtures. We show that UDC and its mel-scale variant MUDC significantly outperform all the other representations.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"72 1","pages":"7495-7499"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74061692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Calibration and multiple system fusion for spoken term detection using linear logistic regression","authors":"Julien van Hout, L. Ferrer, D. Vergyri, N. Scheffer, Yun Lei, V. Mitra, S. Wegmann","doi":"10.1109/ICASSP.2014.6854985","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854985","url":null,"abstract":"State-of-the-art calibration and fusion approaches for spoken term detection (STD) systems currently rely on a multi-pass approach where the scores are calibrated, then fused, and finally re-calibrated to obtain a single decision threshold across keywords. While the above techniques are theoretically correct, they rely on meta-parameter tuning and are prone to over-fitting. This study presents an efficient and effective score calibration technique for keyword detection that is based on the logistic regression calibration approach commonly used in forensic speaker identification. The technique applies seamlessly to both single systems and to system fusion, and enables optimization for specific keyword detection evaluation functions. We run experiments on a Vietnamese STD task, comparing the technique with more empirical calibration and fusion schemes and demonstrate that we can achieve comparable or better performance in terms of the NIST ATWV metric with a more elegant solution.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"72 1","pages":"7138-7142"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74073498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulation-driven emulation of collaborative algorithms to assess their requirements for a large-scale WSN implementation","authors":"D. Manatakis, Michael G. Nennes, I. Bakas, E. Manolakos","doi":"10.1109/ICASSP.2014.6855232","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6855232","url":null,"abstract":"Assessing how the performance of a decentralized wireless sensor network (WSN) algorithm's implementation scales, in terms of communication and energy costs, as the network size increases is an essential requirement before its field deployment. Simulations are commonly used for this purpose, especially for large-scale environmental monitoring applications. However, it is difficult to evaluate energy consumption, processing and memory requirements before the algorithm is really ported to a real WSN platform. We propose a method for emulating the operation of collaborative algorithms in large-scale WSNs by re-using a small number of available real sensor nodes. We demonstrate the potential of the proposed simulation-driven WSN emulation approach by using it to estimate how communication and energy costs scale with the network's size when implementing a collaborative algorithm we developed in [12] for tracking the spatiotemporal evolution of a progressing environmental hazard.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"61 1","pages":"8360-8364"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74298441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improvements to filterbank and delta learning within a deep neural network framework","authors":"Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, G. Saon, B. Ramabhadran","doi":"10.1109/ICASSP.2014.6854925","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854925","url":null,"abstract":"Many features used in speech recognition tasks are hand-crafted and are not always related to the objective at hand, that is minimizing word error rate. Recently, we showed that replacing a perceptually motivated mel-filter bank with a filter bank layer that is learned jointly with the rest of a deep neural network was promising. In this paper, we extend filter learning to a speaker-adapted, state-of-the-art system. First, we incorporate delta learning into the filter learning framework. Second, we incorporate various speaker adaptation techniques, including VTLN warping and speaker identity features. On a 50-hour English Broadcast News task, we show that we can achieve a 5% relative improvement in word error rate (WER) using the filter and delta learning, compared to having a fixed set of filters and deltas. Furthermore, after speaker adaptation, we find that filter and delta learning allows for a 3% relative improvement in WER compared to a state-of-the-art CNN.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"54 1","pages":"6839-6843"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75775499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frequency-shift filtering for OFDM recovery in narrowband power line communications","authors":"Nir Shlezinger, R. Dabora","doi":"10.1109/ICASSP.2014.6855173","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6855173","url":null,"abstract":"Power line communications (PLC) has been drawing considerable interest in recent years due to the growing interest in smart grid implementation. In smart grids, network control and grid applications are allocated the frequency band of 0-500 kHz, commonly referred to as the narrowband PLC channel. This channel is characterized by strong periodic noise and low signal to noise ratio (SNR). In this work we propose a receiver which uses frequency shift filtering to exploit the cyclostationary properties of both the narrowband PLC noise, as well as the information signal, digitally modulated using orthogonal frequency division multiplexing. The results show that the new receiver obtains a substantial performance gain over previously proposed receivers, without requiring any coordination with the transmitter.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"29 1","pages":"8073-8077"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75794418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction of sparse signals from highly corrupted measurements by nonconvex minimization","authors":"Marko Filipovic","doi":"10.1109/ICASSP.2014.6854230","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854230","url":null,"abstract":"We propose a method for signal recovery in compressed sensing when measurements can be highly corrupted. It is based on ℓ_p minimization for 0 < p ≤ 1. Since it was shown that ℓ_p minimization performs better than ℓ_1 minimization when there are no large errors, the proposed approach is a natural extension to compressed sensing with corruptions. We provide a theoretical justification of this idea, based on analogous reasoning as in the case when measurements are not corrupted by large errors. Better performance of the proposed approach compared to ℓ_1 minimization is illustrated in numerical experiments.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"28 1","pages":"3395-3399"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74659349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A computationally efficient calibration algorithm for the LOFAR radio astronomical array","authors":"Yuntao Wu, Amir Leshem, S. Wijnholds","doi":"10.1109/ICASSP.2014.6854635","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854635","url":null,"abstract":"In this paper, the problem of self-calibration for large astronomical arrays such as the Dutch Low Frequency Array (LOFAR) is considered. We assume direction dependent gain and phase errors which need to be estimated and calibrated out. Combining the subspace fitting and least square approaches, the signal subspace of the received single short-term interval (STI) sample data of the LOFAR is used to build a cost function whose minimizer is a statistically efficient estimator of the unknown parameters, namely the gains and phases of the telescopes. Subsequently, an iterative algorithm for finding the minimum of the cost function is presented and the unknown calibration parameters of both the core stations and the external subarray are separated. As a result, the computational complexity of the proposed method is significantly reduced compared to the existing methods based on a direct covariance fitting. Finally, the performance of the proposed method is compared with the conventional peeling method in computer simulation. An example for calibrating the core of the LOFAR array on Cyg A is also provided.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"5402-5406"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74875531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retrieving the syntactic structure of erroneous ASR transcriptions for open-domain Spoken Language Understanding","authors":"Frédéric Béchet, Benoit Favre, Alexis Nasr, Mathieu Morey","doi":"10.1109/ICASSP.2014.6854372","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854372","url":null,"abstract":"Retrieving the syntactic structure of erroneous ASR transcriptions can be of great interest for open-domain Spoken Language Understanding tasks in order to correct or at least reduce the impact of ASR errors on final applications. Most of the previous works on ASR and syntactic parsing have addressed this problem by using syntactic features during ASR to help reducing Word Error Rate (WER). The improvement obtained is often rather small, however the structure and the relations between words obtained through parsing can be of great interest for the SLU processes, even without a significant decrease of WER. That is why we adopt another point of view in this paper: considering that ASR transcriptions contain inevitably some errors, we show in this study that it is possible to improve the syntactic analysis of these erroneous transcriptions by performing a joint error detection / syntactic parsing process. The applicative framework used in this study is a speech-to-speech system developed through the DARPA BOLT project.","PeriodicalId":6545,"journal":{"name":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"10 1","pages":"4097-4101"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73007246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}