{"title":"Hypergraphs with Edge-Dependent Vertex Weights: Spectral Clustering Based on the 1-Laplacian","authors":"Yu Zhu, Boning Li, Santiago Segarra","doi":"10.1109/icassp43922.2022.9746363","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9746363","url":null,"abstract":"We propose a flexible framework for defining the 1-Laplacian of a hypergraph that incorporates edge-dependent vertex weights. These weights are able to reflect varying importance of vertices within a hyperedge, thus conferring the hypergraph model higher expressivity than homogeneous hypergraphs. We then utilize the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian to cluster the vertices. From a theoretical standpoint based on an adequately defined normalized Cheeger cut, this procedure is expected to achieve higher clustering accuracy than that based on the traditional Laplacian. Indeed, we confirm that this is the case using real-world datasets to demonstrate the effectiveness of the proposed spectral clustering approach. Moreover, we show that for a special case within our framework, the corresponding hypergraph 1-Laplacian is equivalent to the 1-Laplacian of a related graph, whose eigenvectors can be computed more efficiently, facilitating the adoption on larger datasets.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133537029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low Resources Online Single-Microphone Speech Enhancement with Harmonic Emphasis","authors":"Nir Raviv, Ofer Schwartz, S. Gannot","doi":"10.1109/icassp43922.2022.9747656","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9747656","url":null,"abstract":"In this paper, we propose a deep neural network (DNN)-based single-microphone speech enhancement algorithm characterized by a short latency and low computational resources. Many speech enhancement algorithms suffer from low noise reduction capabilities between pitch harmonics, and in severe cases, the harmonic structure may even be lost. Recognizing this drawback, we propose a new weighted loss that emphasizes pitch-dominated frequency bands. For that, we propose a method, applied only at the training stage, to detect these frequency bands. The proposed method is applied to speech signals contaminated by several noise types, and in particular, typical domestic noise drawn from ESC-50 and DEMAND databases, demonstrating its applicability to ‘stay-at-home’ scenarios.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133542953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Confidence Estimation for Speech Emotion Recognition Based on the Relationship Between Emotion Categories and Primitives","authors":"Y. Li, C. Papayiannis, Viktor Rozgic, Elizabeth Shriberg, Chao Wang","doi":"10.1109/ICASSP43922.2022.9746930","DOIUrl":"https://doi.org/10.1109/ICASSP43922.2022.9746930","url":null,"abstract":"Confidence estimation for Speech Emotion Recognition (SER) is instrumental in improving the reliability in the behavior of downstream applications. In this work we propose (1) a novel confidence metric for SER based on the relationship between emotion primitives: arousal, valence, and dominance (AVD) and emotion categories (ECs), (2) EmoConfidNet - a DNN trained alongside the EC recognizer to predict the proposed confidence metric, and (3) a data filtering technique used to enhance the training of EmoConfidNet and the EC recognizer. For each training sample, we calculate distances from corresponding AVD annotation vectors to centroids of each EC in the AVD space, and define EC confidences as functions of the evaluated distances. EmoConfidNet is trained to predict confidence from the same acoustic representations used to train the EC recognizer. EmoConfidNet outperforms state-of-the-art confidence estimation methods on the MSP-Podcast and IEMOCAP datasets. For a fixed EC recognizer, after we reject the same number of low confidence predictions using EmoConfidNet, we achieve a higher F1 and unweighted average recall (UAR) than when rejecting using other methods.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133566899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Collaborative Learning for Sequence Modelling","authors":"Francois Buet-Golfouse, Hans Roggeman, Islam Utyagulov","doi":"10.1109/icassp43922.2022.9746494","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9746494","url":null,"abstract":"Current deep learning techniques for RNA classification suffer from over-fitting and lack of reproducibility. We show that by introducing robustness by design in both CNN and RNN algorithms, we are able to achieve standalone state-of-the-art accuracy. By constructing model-agnostic robustness checks and reusing features obtained from both architectures, we build a collaborative framework that improves performance and stability.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133271264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepHull: Fast Convex Hull Approximation in High Dimensions","authors":"Randall Balestriero, Zichao Wang, Richard Baraniuk","doi":"10.1109/ICASSP43922.2022.9746031","DOIUrl":"https://doi.org/10.1109/ICASSP43922.2022.9746031","url":null,"abstract":"Computing or approximating the convex hull of a dataset plays a role in a wide range of applications, including economics, statistics, and physics, to name just a few. However, convex hull computation and approximation is exponentially complex, in terms of both memory and computation, as the ambient space dimension increases. In this paper, we propose DeepHull, a new convex hull approximation algorithm based on convex deep networks (DNs) with continuous piecewise-affine nonlinearities and nonnegative weights. The idea is that binary classification between true data samples and adversarially generated samples with such a DN naturally induces a polytope decision boundary that approximates the true data convex hull. A range of exploratory experiments demonstrates that DeepHull efficiently produces a meaningful convex hull approximation, even in a high-dimensional ambient space.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133346541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Anomaly Detection for Container Cloud Via BILSTM-Based Variational Auto-Encoder","authors":"Yulong Wang, Xingshu Chen, Qixu Wang, Run Yang, Bangzhou Xin","doi":"10.1109/icassp43922.2022.9747341","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9747341","url":null,"abstract":"The appearance of container technology has profoundly changed the development and deployment of multi-tier distributed applications. However, the imperfect system resource isolation features and the kernel-sharing mechanism will introduce significant security risks to the container-based cloud. In this paper, we propose a real-time unsupervised anomaly detection system for monitoring system calls in container cloud via BiLSTM-based variational auto-encoder (VAE). Our proposed BiLSTM-based VAE network leverages the generative characteristics of VAE to learn the robust representations of normal patterns by reconstruction probabilities while being sensitive to long-term dependencies. Our evaluations using real-world datasets show that the BiLSTM-based VAE network achieves excellent detection performance without introducing significant running performance overhead to the container platform.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133631090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Subject-Invariant Representations from Speech-Evoked EEG Using Variational Autoencoders","authors":"Lies Bollens, T. Francart, H. V. hamme","doi":"10.1109/ICASSP43922.2022.9747297","DOIUrl":"https://doi.org/10.1109/ICASSP43922.2022.9747297","url":null,"abstract":"The electroencephalogram (EEG) is a powerful method to understand how the brain processes speech. Linear models have recently been replaced for this purpose with deep neural networks and yield promising results. In related EEG classification fields, it is shown that explicitly modeling subject-invariant features improves generalization of models across subjects and benefits classification accuracy. In this work, we adapt factorized hierarchical variational autoencoders to exploit parallel EEG recordings of the same stimuli. We model EEG into two disentangled latent spaces. Subject accuracy reaches 98.96% and 1.60% on respectively the subject and content latent space, whereas binary content classification experiments reach an accuracy of 51.51% and 62.91% on respectively the subject and content latent space.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133668656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Controlled Sensing and Anomaly Detection Via Soft Actor-Critic Reinforcement Learning","authors":"Chen Zhong, M. C. Gursoy, Senem Velipasalar","doi":"10.1109/icassp43922.2022.9747436","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9747436","url":null,"abstract":"To address the anomaly detection problem in the presence of noisy observations and to tackle the tuning and efficient exploration challenges that arise in deep reinforcement learning algorithms, we in this paper propose a soft actor-critic deep reinforcement learning framework. To evaluate the proposed framework, we measure its performance in terms of detection accuracy, stopping time, and the total number of samples needed for detection. Via simulation results, we demonstrate the performance when soft actor-critic algorithms are employed, and identify the impact of key parameters, such as the sensing cost, on the performance. In all results, we further provide comparisons between the performances of the proposed soft actor-critic and conventional actor-critic algorithms.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132137080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physical Layer Anonymous Communications: An Anonymity Entropy Oriented Precoding Design (Invited Paper)","authors":"Zhongxiang Wei, C. Masouros, Sumei Sun","doi":"10.1109/icassp43922.2022.9746100","DOIUrl":"https://doi.org/10.1109/icassp43922.2022.9746100","url":null,"abstract":"Different from traditional security-oriented designs, the aim of anonymizing techniques is to mask users' identities during communication, thereby providing users with unidentifiability and unlinkability. The existing anonymizing techniques are only designated at upper layers of networks, ignoring the risk of anonymity leakage at physical layer (PHY). In this paper, we address the PHY anonymity design with focus on a typical uplink scenario where the receiver is equipped with more antennas than the sender. With the increased degrees-of-freedom at the receiver side, we first propose a maximum likelihood estimation (MLE) signal trace-back detector, which only analyzes the signaling pattern of the received signal to disclose the sender's identity. Accordingly, an anonymity entropy anonymous (AEA) precoder is proposed, which manipulates the transmitted signalling pattern to counteract the receiver's trace-back detector and meanwhile to guarantee high receive signal-to-interference-plus-noise ratio for communication. More importantly, more data streams can be multiplexed than the number of transmit antennas, which is particularly suitable for the strong receiver configuration. Simulation demonstrates that the proposed AEA precoder can simultaneously provide high anonymity and communication performance.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132270137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Hierarchical Translation-Based Model for Multi-Modal Medical Image Registration","authors":"X. Dai, Tai Ma, Haibin Cai, Ying Wen","doi":"10.1109/ICASSP43922.2022.9746324","DOIUrl":"https://doi.org/10.1109/ICASSP43922.2022.9746324","url":null,"abstract":"Deformable registration of multi-modal medical images is a challenging task in medical image processing due to the differences in both appearance and structure. We propose an unsupervised hierarchical translation-based model to perform a coarse to fine registration of multi-modal medical images. The proposed model consists of three parts: a coarse registration network, a modal translation network and a fine registration network. First, the coarse registration network learns to obtain the coarse deformation field, which is applied as structure-preserving information to generate a translated image by the modal translation network. Then, the translated image as enhancing information combined with the original images are used to derive a fine deformation field in the fine registration network. Furthermore, the final deformation field is composed from the coarse and the fine deformation fields. In this way, the proposed model can learn high accurate deformation field to implement multi-modal medical image registration. Experiments on two multi-modal brain image datasets demonstrate the effectiveness of this model.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132275886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}