{"title":"Network Adaptation Strategies for Learning New Classes without Forgetting the Original Ones","authors":"Hagai Taitelbaum, Gal Chechik, J. Goldberger","doi":"10.1109/ICASSP.2019.8682848","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682848","url":null,"abstract":"We address the problem of adding new classes to an existing classifier without hurting the original classes, when no access is allowed to any sample from the original classes. This problem arises frequently since models are often shared without their training data, due to privacy and data ownership concerns. We propose an easy-to-use approach that modifies the original classifier by retraining a suitable subset of layers using a linearly-tuned knowledge-distillation regularization. The set of layers that is tuned depends on the number of newly added classes and the number of original classes. We evaluate the proposed method on two standard datasets, first in a language-identification task, then in an image classification setup. In both cases, the method achieves classification accuracy that is almost as good as that obtained by a system trained using unrestricted samples from both the original and new classes.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"78 1","pages":"3637-3641"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90247056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing the Orthogonal Periodic Sequences for the Identification of Functional Link Polynomial Filters","authors":"A. Carini, S. Orcioni, S. Cecchi","doi":"10.1109/ICASSP.2019.8683342","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683342","url":null,"abstract":"The paper introduces a novel family of deterministic signals, the orthogonal periodic sequences (OPSs), for the identification of functional link polynomial (FLiP) filters. The novel sequences share many of the characteristics of the perfect periodic sequences (PPSs). Like the PPSs, they allow the perfect identification of a FLiP filter on a finite time interval with the cross-correlation method. In contrast to the PPSs, OPSs can also identify non-orthogonal FLiP filters, such as Volterra filters. With OPSs, the input sequence can have any persistently exciting distribution and can also be a quantized sequence. OPSs can often identify FLiP filters with a sequence period and a computational complexity much smaller than those of PPSs. Several results are reported to show the effectiveness of the proposed sequences in identifying a real nonlinear audio system.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"65 1","pages":"5486-5490"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90268114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
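A simplified illustration of the cross-correlation identification method this abstract builds on, reduced to the plain linear (FIR) case with a white stochastic input; the filter taps, signal length, and variable names are illustrative choices, not from the paper. With a zero-mean, unit-variance white input x, the input/output cross-correlation r_xy(k) directly recovers the impulse response h(k); PPS/OPS identification of FLiP filters generalizes this idea to nonlinear basis functions with deterministic periodic inputs.

```python
import numpy as np

rng = rng = np.random.default_rng(1)
h = np.array([0.5, 1.0, -0.3, 0.2])   # "unknown" FIR filter to identify
N = 200_000
x = rng.standard_normal(N)            # zero-mean, unit-variance white input
y = np.convolve(x, h)[:N]             # filter output y(n) = sum_k h(k) x(n-k)
# Cross-correlation estimate: h_hat(k) = (1/N) * sum_n y(n) * x(n-k)
h_hat = np.array([np.dot(y[k:], x[:N - k]) / N for k in range(len(h))])
```

The estimate converges at rate O(1/sqrt(N)); with deterministic perfect or orthogonal periodic sequences, the analogous estimate is exact over one period.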
{"title":"Performance Analysis of Convex Data Detection in MIMO","authors":"Ehsan Abbasi, Fariborz Salehi, B. Hassibi","doi":"10.1109/ICASSP.2019.8683890","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683890","url":null,"abstract":"We study the performance of a convex data detection method in large multiple-input multiple-output (MIMO) systems. The goal is to recover an n-dimensional complex signal whose entries are from an arbitrary constellation $\\mathcal{D} \\subset \\mathbb{C}$, using m noisy linear measurements. Since Maximum Likelihood (ML) estimation involves minimizing a loss function over the discrete set $\\mathcal{D}^n$, it becomes computationally intractable for large n. One approach is to relax $\\mathcal{D}$ to a convex set, utilize convex programming to solve the relaxed problem exactly, and then map the answer to the closest point in the set $\\mathcal{D}$. We assume an i.i.d. complex Gaussian channel matrix and derive expressions for the symbol error probability of the proposed convex method in the limit of m, n → ∞. Prior work was only able to do so for real-valued constellations such as BPSK and PAM. The main contribution of this paper is to extend the results to complex-valued constellations. In particular, we use our main theorem to calculate the performance of the complex algorithm for PSK and QAM constellations. In addition, we introduce a closed-form formula for the symbol error probability in the high-SNR regime and determine the minimum number of measurements m required for consistent signal recovery.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"66 1","pages":"4554-4558"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90291360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
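A minimal sketch of the relax-solve-round recipe this abstract describes, shown for the simplest real-valued constellation (BPSK, D = {-1, +1}) rather than the paper's complex PSK/QAM setting; the problem sizes, noise level, and the projected-gradient solver are illustrative assumptions.

```python
import numpy as np

def box_relaxation_detect(H, y, steps=500):
    """Relax {-1,+1}^n to the box [-1,1]^n, solve by projected gradient, round."""
    n = H.shape[1]
    x = np.zeros(n)
    lr = 1.0 / np.linalg.norm(H, 2) ** 2       # step size from the Lipschitz constant
    for _ in range(steps):
        grad = H.T @ (H @ x - y)               # gradient of 0.5 * ||y - H x||^2
        x = np.clip(x - lr * grad, -1.0, 1.0)  # projection onto the box
    return np.sign(x)                          # map to the closest BPSK symbol

# Toy usage: recover BPSK symbols sent through a Gaussian channel with mild noise.
rng = np.random.default_rng(0)
n, m = 8, 32
H = rng.standard_normal((m, n))
x_true = rng.choice([-1.0, 1.0], size=n)
y = H @ x_true + 0.1 * rng.standard_normal(m)
x_hat = box_relaxation_detect(H, y)
```

The paper's analysis characterizes exactly how often the final rounding step makes symbol errors as m, n grow.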
{"title":"Automatic Transcription of Diatonic Harmonica Recordings","authors":"Filipe M. Lins, M. Johann, Emmanouil Benetos, Rodrigo Schramm","doi":"10.1109/ICASSP.2019.8682334","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682334","url":null,"abstract":"This paper presents a method for automatic transcription of the diatonic Harmonica instrument. It estimates the multi-pitch activations through a spectrogram factorisation framework. This framework is based on Probabilistic Latent Component Analysis (PLCA) and uses a fixed 4-dimensional dictionary with spectral templates extracted from the Harmonica instrument's timbre. Methods based on spectrogram factorisation may suffer from local-optima issues in the presence of harmonic overlap or considerable timbre variability. To alleviate this issue, we propose a set of harmonic constraints that are inherent to the Harmonica instrument's note layout or are caused by specific diatonic Harmonica playing techniques. These constraints help to guide the factorisation process until convergence into meaningful multi-pitch activations is achieved. This work also builds a new audio dataset containing solo recordings of diatonic Harmonica excerpts and the respective multi-pitch annotations. We compare our proposed approach against multiple baseline techniques for automatic music transcription on this dataset and report the results based on frame-based F-measure statistics.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"29 1","pages":"256-260"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90299941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
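A toy sketch of dictionary-based spectrogram factorisation in the same family as the PLCA framework above: activations H are estimated for a fixed spectral dictionary W under the KL divergence via standard multiplicative updates. The paper's 4-dimensional PLCA model and harmonica-specific harmonic constraints are not reproduced here; `fixed_dict_activations` and all matrix sizes are illustrative.

```python
import numpy as np

def fixed_dict_activations(V, W, iters=1000, eps=1e-9):
    """Estimate nonnegative activations H with V ~ W @ H (KL divergence), W held fixed."""
    K, T = W.shape[1], V.shape[1]
    H = np.full((K, T), 1.0 / K)               # uniform positive initialization
    col_sums = W.sum(axis=0, keepdims=True).T  # (K, 1): the denominators W^T 1
    for _ in range(iters):
        R = W @ H + eps                        # current reconstruction
        H *= (W.T @ (V / R)) / (col_sums + eps)  # multiplicative KL update for H
    return H
```

With W fixed, the KL objective is convex in H, so these updates converge to a global optimum; the local-optima issues the abstract mentions arise in richer multi-factor models, which is where the proposed harmonic constraints help.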
{"title":"Imitation Refinement for X-ray Diffraction Signal Processing","authors":"Junwen Bai, Zihang Lai, Runzhe Yang, Yexiang Xue, J. Gregoire, C. Gomes","doi":"10.1109/ICASSP.2019.8683723","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683723","url":null,"abstract":"Many real-world tasks involve identifying signals from data satisfying background or prior knowledge. In domains like materials discovery, due to the flaws and biases in raw experimental data, the identification of X-ray diffraction (XRD) signals often requires significant (manual) expert work to find refined signals that are similar to the ideal theoretical ones. Automatically refining the raw XRD signals utilizing simulated theoretical data is thus desirable. We propose imitation refinement, a novel approach to refine imperfect input signals, guided by a pre-trained classifier incorporating prior knowledge from simulated theoretical data, such that the refined signals imitate the ideal ones. The classifier is trained on the ideal simulated data to classify signals and learns an embedding space where each class is represented by a prototype. The refiner learns to refine the imperfect signals with small modifications, such that their embeddings are closer to the corresponding prototypes. We show that the refiner can be trained in both supervised and unsupervised fashions. We further illustrate the effectiveness of the proposed approach both qualitatively and quantitatively in an X-ray diffraction signal refinement task in materials discovery.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"46 1","pages":"3337-3341"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90311979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context Modelling Using Hierarchical Attention Networks for Sentiment and Self-assessed Emotion Detection in Spoken Narratives","authors":"Lukas Stappen, N. Cummins, Eva-Maria Messner, H. Baumeister, J. Dineley, Björn Schuller","doi":"10.1109/ICASSP.2019.8683801","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683801","url":null,"abstract":"Automatic detection of sentiment and affect in personal narratives through word usage has the potential to assist in the automated detection of change in psychotherapy. Such a tool could, for instance, provide an efficient, objective measure of the time a person has been in a positive or negative state-of-mind. Towards this goal, we propose and develop a hierarchical attention model for the tasks of sentiment (positive and negative) and self-assessed affect detection in transcripts of personal narratives. We also perform a qualitative analysis of the word attentions learnt by our sentiment analysis model. In a key result, our attention model achieved an unweighted average recall (UAR) of 91.0% in a binary sentiment detection task on the test partition of the Ulm State-of-Mind in Speech (USoMS) corpus. We also achieved UARs of 73.7% and 68.6% in the 3-class tasks of arousal and valence detection, respectively. Finally, our qualitative analysis associates colloquial reinforcements with positive sentiments, and uncertain phrasing with negative sentiments.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"47 14","pages":"6680-6684"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91435895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder","authors":"S. Inoue, H. Kameoka, Li Li, Shogo Seki, S. Makino","doi":"10.1109/ICASSP.2019.8683497","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8683497","url":null,"abstract":"In this paper, we deal with a multichannel source separation problem under a highly reverberant condition. The multichannel variational autoencoder (MVAE) is a recently proposed source separation method that employs the decoder distribution of a conditional VAE (CVAE) as the generative model for the complex spectrograms of the underlying source signals. Although MVAE is notable in that it can significantly improve the source separation performance compared with conventional methods, its capability to separate highly reverberant mixtures is still limited since MVAE uses an instantaneous mixture model. To overcome this limitation, in this paper we propose extending MVAE to simultaneously solve source separation and dereverberation problems by formulating the separation system as a frequency-domain convolutive mixture model. A convergence-guaranteed algorithm based on the coordinate descent method is derived for the optimization. Experimental results revealed that the proposed method outperformed the conventional methods in terms of all the source separation criteria in highly reverberant environments.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"16 1","pages":"96-100"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84952009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings","authors":"J. Cramer, Ho-Hsiang Wu, J. Salamon, J. Bello","doi":"10.1109/ICASSP.2019.8682475","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682475","url":null,"abstract":"A considerable challenge in applying deep learning to audio classification is the scarcity of labeled data. An increasingly popular solution is to learn deep audio embeddings from large audio collections and use them to train shallow classifiers using small labeled datasets. Look, Listen, and Learn (L3-Net) is an embedding trained through self-supervised learning of audio-visual correspondence in videos, as opposed to other embeddings that require labeled data. This framework has the potential to produce powerful out-of-the-box embeddings for downstream audio classification tasks, but has a number of unexplained design choices that may impact the embeddings’ behavior. In this paper we investigate how L3-Net design choices impact the performance of downstream audio classifiers trained with these embeddings. We show that audio-informed choices of input representation are important, and that using sufficient data for training the embedding is key. Surprisingly, we find that matching the content for training the embedding to the downstream task is not beneficial. Finally, we show that our best variant of the L3-Net embedding outperforms both the VGGish and SoundNet embeddings, while having fewer parameters and being trained on less data. Our implementation of the L3-Net embedding model as well as pre-trained models are made freely available online.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"5 1","pages":"3852-3856"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85310875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving Quadratic Equations via Amplitude-based Nonconvex Optimization","authors":"Vincent Monardo, Yuanxin Li, Yuejie Chi","doi":"10.1109/ICASSP.2019.8682357","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682357","url":null,"abstract":"In many signal processing tasks, one seeks to recover an r-column matrix object $\\mathbf{X} \\in \\mathbb{C}^{n \\times r}$ from a set of nonnegative quadratic measurements up to orthonormal transforms. Example applications include coherence retrieval in optical imaging and covariance sketching for high-dimensional streaming data. To this end, efficient nonconvex optimization methods are quite appealing, due to their computational efficiency and scalability to large-scale problems. There is a recent surge of activities in designing nonconvex methods for the special case r = 1, known as phase retrieval; however, very little work has studied the general rank-r setting. Motivated by the success of phase retrieval, in this paper we derive several algorithms which utilize the quadratic loss function based on amplitude measurements, including (stochastic) gradient descent and alternating minimization. Numerical experiments demonstrate their computational and statistical performances, highlighting the superior performance of stochastic gradient descent with appropriate mini-batch sizes.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"57 11","pages":"5526-5530"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91399390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
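A minimal sketch of amplitude-based gradient descent for the special rank-1 case r = 1 (phase retrieval): minimize the amplitude loss (1/2m) * sum_i (|a_i^H z| - b_i)^2 from a standard spectral initialization. The step size, iteration count, and function names are illustrative assumptions, not the paper's exact algorithm or its general rank-r variants.

```python
import numpy as np

def amplitude_flow(A, b, steps=500, lr=0.5):
    """Recover x (up to a global sign/phase) from amplitude measurements b = |A x|."""
    m, n = A.shape
    # Spectral initialization: leading eigenvector of (1/m) * A^H diag(b^2) A,
    # scaled by the norm estimate sqrt(mean(b^2)).
    Y = (A.conj().T * (b ** 2)) @ A / m
    _, V = np.linalg.eigh(Y)                   # eigenvalues in ascending order
    z = V[:, -1] * np.sqrt(np.mean(b ** 2))
    for _ in range(steps):
        Az = A @ z
        # Gradient of (1/2m) * sum_i (|a_i^H z| - b_i)^2 (phase kept via Az/|Az|).
        grad = A.conj().T @ (Az - b * Az / np.maximum(np.abs(Az), 1e-12)) / m
        z = z - lr * grad
    return z
```

With i.i.d. Gaussian measurements and enough of them (m a modest multiple of n), this init-plus-descent scheme typically converges linearly to the true signal up to the unavoidable global ambiguity.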
{"title":"Nonlinear Acceleration of Constrained Optimization Algorithms","authors":"Vien V. Mai, M. Johansson","doi":"10.1109/ICASSP.2019.8682962","DOIUrl":"https://doi.org/10.1109/ICASSP.2019.8682962","url":null,"abstract":"This paper introduces a novel technique for nonlinear acceleration of first-order methods for constrained convex optimization. Previous studies of nonlinear acceleration have only been able to provide convergence guarantees for unconstrained convex optimization. In contrast, our method is able to avoid infeasibility of the accelerated iterates and retains the theoretical performance guarantees of the unconstrained case. We focus on Anderson acceleration of the classical projected gradient descent (PGD) method, but our techniques can easily be extended to more sophisticated algorithms, such as mirror descent. Due to the presence of a constraint set, the relevant fixed-point mapping for PGD is not differentiable. However, we show that the convergence results for Anderson acceleration of smooth fixed-point iterations can be extended to the non-smooth case under certain technical conditions.","PeriodicalId":13203,"journal":{"name":"ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"17 1","pages":"4903-4907"},"PeriodicalIF":0.0,"publicationDate":"2019-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78029089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
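A compact sketch of Anderson acceleration wrapped around projected gradient descent (PGD), in the spirit of this abstract, applied to a nonnegativity-constrained least-squares toy problem. The memory size, the regularized least-squares solve for the mixing weights, and the simple feasibility guard (projecting the accelerated iterate back onto the constraint set) are illustrative choices, not the paper's exact safeguarding scheme.

```python
import numpy as np

def pgd_map(x, A, b, t):
    """One projected gradient step for min 0.5*||Ax - b||^2 s.t. x >= 0."""
    return np.maximum(x - t * (A.T @ (A @ x - b)), 0.0)

def anderson_pgd(A, b, steps=200, mem=5):
    t = 1.0 / np.linalg.norm(A, 2) ** 2   # step size from the Lipschitz constant
    x = np.zeros(A.shape[1])
    X, G = [], []                         # histories of iterates and mapped iterates
    for _ in range(steps):
        gx = pgd_map(x, A, b, t)
        X.append(x); G.append(gx)
        X, G = X[-(mem + 1):], G[-(mem + 1):]
        # Residuals r_i = g(x_i) - x_i; choose weights alpha with sum(alpha) = 1
        # minimizing ||sum_i alpha_i r_i|| via regularized normal equations.
        R = np.array([g - xi for g, xi in zip(G, X)])
        M = R @ R.T + 1e-10 * np.eye(len(X))
        alpha = np.linalg.solve(M, np.ones(len(X)))
        alpha /= alpha.sum()
        # Accelerated iterate, projected back onto the feasible set x >= 0.
        x = np.maximum(sum(a * g for a, g in zip(alpha, G)), 0.0)
    return x
```

The projection after the extrapolation step is what keeps the accelerated iterates feasible, which is the obstacle the paper addresses for constrained problems; plain Anderson acceleration of the unconstrained map can leave the constraint set.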