{"title":"Laplacian Regularized Tensor Low-Rank Minimization for Hyperspectral Snapshot Compressive Imaging","authors":"Yi Yang, Fei Jiang, Hongtao Lu","doi":"10.1109/ICASSP39728.2021.9413381","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413381","url":null,"abstract":"Snapshot Compressive Imaging (SCI) systems, including hyperspectral compressive imaging and video compressive imaging, are designed to depict high-dimensional signals with limited data by mapping multiple images into one. One key module of SCI systems is a high quality reconstruction algorithm for original frames. However, most existing decoding algorithms are based on vectorization representation and fail to capture the intrinsic structural information of high dimensional signals. In this paper, we propose a tensor-based low-rank reconstruction algorithm with hyper-Laplacian constraint for hyperspectral SCI systems. First, we integrate the non-local self-similarity and tensor low-rank minimization approach to explore the intrinsic structural correlations along spatial and spectral domains. Then, we introduce a hyper-Laplacian constraint to model the global spectral structures, alleviating the ringing artifacts in the spatial domain. Experimental results on hyperspectral image corpus demonstrate the proposed algorithm achieves average 0.8~2.9 dB improvement in PSNR over state-of-the-art work.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125237680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High-Frame-Rate Eye-Tracking Framework for Mobile Devices","authors":"Yuhu Chang, Changyang He, Yingying Zhao, T. Lu, Ning Gu","doi":"10.1109/ICASSP39728.2021.9414624","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414624","url":null,"abstract":"Gaze-on-screen tracking, an appearance-based eye-tracking task, has drawn significant interest in recent years. While learning-based high-precision eye-tracking methods have been designed in the past, the complex pre-training and high computation in neural network-based deep models restrict their applicability in mobile devices. Moreover, as the display frame rate of mobile devices has steadily increased to 120 fps, high-frame-rate eye tracking becomes increasingly challenging. In this work, we tackle the tracking efficiency challenge and introduce GazeHFR, a biologic-inspired eye-tracking model specialized for mobile devices, offering both high accuracy and efficiency. Specifically, GazeHFR classifies the eye movement into two distinct phases, i.e., saccade and smooth pursuit, and leverages inter-frame motion information combined with lightweight learning models tailored to each movement phase to deliver high-efficient eye tracking without affecting accuracy. Compared to prior art, GazeHFR achieves approximately 7x speedup and 15% accuracy improvement on mobile devices.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130788616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Coding for Lossless Dataset Compression","authors":"Madeleine Barowsky, Alexander Mariona, F. Calmon","doi":"10.1109/ICASSP39728.2021.9413447","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413447","url":null,"abstract":"Lossless compression of datasets is a problem of significant theoretical and practical interest. It appears naturally in the task of storing, sending, or archiving large collections of information for scientific research. We can greatly improve encoding bitrate if we allow the compression of the original dataset to decompress to a permutation of the data. We prove the equivalence of dataset compression to compressing a permutation-invariant structure of the data and implement such a scheme via predictive coding. We benchmark our compression procedure against state-of-the-art compression utilities on the popular machine-learning datasets MNIST and CIFAR-10 and outperform for multiple parameter sets.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131029045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Speech Enhancement for Mobile Communication Based on Dual-Channel Complex Spectral Mapping","authors":"Ke Tan, Xueliang Zhang, Deliang Wang","doi":"10.1109/ICASSP39728.2021.9414346","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414346","url":null,"abstract":"Speech quality and intelligibility can be severely degraded by background noise in mobile communication. In order to attenuate background noise, speech enhancement systems have been integrated into mobile phones, and a microphone array is typically deployed to improve the enhancement performance. This paper proposes a novel approach to real-time speech enhancement for dual-microphone mobile phones. Our approach employs a causal densely-connected convolutional recurrent network to perform dual-channel complex spectral mapping. We apply a structured pruning technique for compressing the model without significantly affecting the enhancement performance. This leads to a real-time enhancement system for on-device processing. Evaluation results show that the proposed approach substantially advances the performance of an earlier approach to dual-channel speech enhancement for mobile communication.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131078354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Short Tutorial on The Weisfeiler-Lehman Test And Its Variants","authors":"Ningyuan Huang, Soledad Villar","doi":"10.1109/ICASSP39728.2021.9413523","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413523","url":null,"abstract":"Graph neural networks are designed to learn functions on graphs. Typically, the relevant target functions are invariant with respect to actions by permutations. Therefore the design of some graph neural network architectures has been inspired by graph-isomorphism algorithms. The classical Weisfeiler-Lehman algorithm (WL)—a graph-isomorphism test based on color refinement—became relevant to the study of graph neural networks. The WL test can be generalized to a hierarchy of higher-order tests, known as k-WL. This hierarchy has been used to characterize the expressive power of graph neural networks, and to inspire the design of graph neural network architectures. A few variants of the WL hierarchy appear in the literature. The goal of this short note is pedagogical and practical: We explain the differences between the WL and folklore-WL formulations, with pointers to existing discussions in the literature. We illuminate the differences between the formulations by visualizing an example.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133012431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging A Multiple-Strain Model with Mutations in Analyzing the Spread of Covid-19","authors":"Anirudh Sridhar, Osman Yağan, Rashad M. Eletreby, S. Levin, J. Plotkin, H. Poor","doi":"10.1109/ICASSP39728.2021.9414595","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414595","url":null,"abstract":"The spread of COVID-19 has been among the most devastating events affecting the health and well-being of humans worldwide since World War II. A key scientific goal concerning COVID-19 is to develop mathematical models that help us to understand and predict its spreading behavior, as well as to provide guidelines on what can be done to limit its spread. In this paper, we discuss how our recent work on a multiple-strain spreading model with mutations can help address some key questions concerning the spread of COVID-19. We highlight the recent reports on a mutation of SARS-CoV-2 that is thought to be more transmissible than the original strain and discuss the importance of incorporating mutation and evolutionary adaptations (together with the network structure) in epidemic models. We also demonstrate how the multiple-strain transmission model can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. Finally, we present simulation results to demonstrate our ideas and the utility of the multiple-strain model in the context of COVID-19.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133537653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quickest Joint Detection and Classification of Faults in Statistically Periodic Processes","authors":"T. Banerjee, Smruti Padhy, A. Taha, E. John","doi":"10.1109/ICASSP39728.2021.9414101","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414101","url":null,"abstract":"An algorithm is proposed to detect and classify a change in the distribution of a stochastic process that has periodic statistical behavior. The problem is posed in the framework of independent and periodically identically distributed (i.p.i.d.) processes, a recently introduced class of processes to model statistically periodic data. It is shown that the proposed algorithm is asymptotically optimal as the rate of false alarms and the probability of misclassification go to zero. This problem has applications in anomaly detection in traffic data, social network data, ECG data, and neural data, where periodic statistical behavior has been observed. The effectiveness of the algorithm is demonstrated by application to real and simulated data.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133539330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Utterance Confidence Measure for RNN-Transducers and Two Pass Models","authors":"Ashutosh Gupta, Ankur Kumar, Dhananjaya N. Gowda, Kwangyoun Kim, Sachin Singh, Shatrughan Singh, Chanwoo Kim","doi":"10.1109/ICASSP39728.2021.9414467","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414467","url":null,"abstract":"In this paper, we propose methods to compute confidence score on the predictions made by an end-to-end speech recognition model in a 2-pass framework. We use RNN-Transducer for a streaming model, and an attention-based decoder for the second pass model. We use neural technique to compute the confidence score, and experiment with various combinations of features from RNN-Transducer and second pass models. The neural confidence score model is trained as a binary classification task to accept or reject a prediction made by speech recognition model. The model is evaluated in a distributed speech recognition environment, and performs significantly better when features from second pass model are used as compared to the features from streaming model.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132282925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Latency Polar Decoder Using Overlapped SCL Processing","authors":"D. Kam, B. Y. Kong, Youngjoo Lee","doi":"10.1109/ICASSP39728.2021.9414326","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414326","url":null,"abstract":"In this paper, we present a novel scheduling method that reduces the latency of polar decoders significantly. Unlike the prior pruning-based successive cancellation list (SCL) decoding that suffers from a number of idle cycles, the proposed overlapped SCL scheme immediately begins node operations without waiting for the list to be sorted, being exempt from such unfavorable cycles. All possible candidates for the next node operations are precomputed in parallel with the pruning operations, and are readily selected to minimize the latency. For the 5G New Radio systems, the proposed method shortens the decoding latency of the state-of-the-art approaches by up to 22% without degrading the error-correcting performance.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132669720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Generative Demixing: Error Bounds for Demixing Subgaussian Mixtures of Lipschitz Signals","authors":"Aaron Berk","doi":"10.1109/ICASSP39728.2021.9413573","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413573","url":null,"abstract":"Generative neural networks (GNNs) have gained renown for efficaciously capturing intrinsic low-dimensional structure in natural images. Here, we investigate the subgaussian demixing problem for two Lipschitz signals, with GNN demixing as a special case. In demixing, one seeks identification of two signals given their sum and prior structural information. Here, we assume each signal lies in the range of a Lipschitz function, which includes many popular GNNs as a special case. We prove a sample complexity bound for nearly optimal recovery error that extends a recent result of Bora, et al. (2017) from the compressed sensing setting with gaussian matrices to demixing with subgaussian ones. Under a linear signal model in which the signals lie in convex sets, McCoy & Tropp (2014) have characterized the sample complexity for identification under subgaussian mixing. In the present setting, the signal structure need not be convex. For example, our result applies to a domain that is a non-convex union of convex cones. We support the efficacy of this demixing model with numerical simulations using trained GNNs, suggesting an algorithm that would be an interesting object of further theoretical study.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132751284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}