Saori Takeyama, Tatsuya Komatsu, Koichi Miyazaki, M. Togami, Shunsuke Ono
{"title":"Robust Acoustic Scene Classification to Multiple Devices Using Maximum Classifier Discrepancy and Knowledge Distillation","authors":"Saori Takeyama, Tatsuya Komatsu, Koichi Miyazaki, M. Togami, Shunsuke Ono","doi":"10.23919/Eusipco47968.2020.9287734","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287734","url":null,"abstract":"This paper proposes robust acoustic scene classification (ASC) to multiple devices using maximum classifier discrepancy (MCD) and knowledge distillation (KD). The proposed method employs domain adaptation to train multiple ASC models dedicated to each device and combines these multiple device-specific models using a KD technique into a multi-domain ASC model. For domain adaptation, the proposed method utilizes MCD to align class distributions that conventional DA for ASC methods have ignored. The multi-device robust ASC model is obtained by KD, combining the multiple device-specific ASC models by MCD that may have a lower performance for non-target devices. Our experiments show that the proposed MCD-based device-specific model improved ASC accuracy by at most 12.22% for target samples, and the proposed KD-based device-general model improved ASC accuracy by 2.13% on average for all devices.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"14 1","pages":"36-40"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88907855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Taishi Nakashima, Robin Scheibler, Yukoh Wakabayashi, Nobutaka Ono
{"title":"Faster independent low-rank matrix analysis with pairwise updates of demixing vectors","authors":"Taishi Nakashima, Robin Scheibler, Yukoh Wakabayashi, Nobutaka Ono","doi":"10.23919/Eusipco47968.2020.9287508","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287508","url":null,"abstract":"In this paper, we present an algorithm for independent low-rank matrix analysis (ILRMA) of three or more sources that is faster than that for conventional ILRMA. In conventional ILRMA, demixing vectors are updated one by one by the iterative projection (IP) method. The update rules of IP are derived from a system of quadratic equations obtained by differentiating the objective function of ILRMA with respect to demixing vectors. This system of quadratic equations is called hybrid exact-approximate joint diagonalization (HEAD) and no closed-form solution is known yet for three or more sources. Recently, a method that can update two demixing vectors simultaneously has been proposed for independent vector analysis. The method is derived by reducing HEAD for two sources to a generalized eigenvalue problem and solving the problem. Furthermore, the pairwise updates have recently been extended to the case of three or more sources. However, the efficacy of the pairwise updates for ILRMA has not yet been investigated. Therefore, in this work, we apply the pairwise updates of demixing vectors to ILRMA. By replacing the update rules of demixing vectors with the proposed pairwise updates, we accelerate the convergence of ILRMA. The experimental results show that the proposed method yields faster convergence and better performance than conventional ILRMA.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"40 1","pages":"301-305"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87612643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Perlmutter, N. Sissouno, A. Viswanathan, M. Iwen
{"title":"A Provably Accurate Algorithm for Recovering Compactly Supported Smooth Functions from Spectrogram Measurements","authors":"Michael Perlmutter, N. Sissouno, A. Viswanathan, M. Iwen","doi":"10.23919/Eusipco47968.2020.9287698","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287698","url":null,"abstract":"We present an algorithm which is closely related to direct phase retrieval methods that have been shown to work well empirically [1], [2] and prove that it is guaranteed to recover (up to a global phase) a large class of compactly supported smooth functions from their spectrogram measurements. As a result, we take a first step toward developing a new class of practical phaseless imaging algorithms capable of producing provably accurate images of a given sample after it is masked by just a few shifts of a fixed periodic grating.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"35 1","pages":"970-974"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90328637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Niccoló Nicodemo, Gaurav Naithani, K. Drossos, T. Virtanen, R. Saletti
{"title":"Memory Requirement Reduction of Deep Neural Networks for Field Programmable Gate Arrays Using Low-Bit Quantization of Parameters","authors":"Niccoló Nicodemo, Gaurav Naithani, K. Drossos, T. Virtanen, R. Saletti","doi":"10.23919/Eusipco47968.2020.9287739","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287739","url":null,"abstract":"Effective employment of deep neural networks (DNNs) in mobile devices and embedded systems, like field programmable gate arrays, is hampered by requirements for memory and computational power. In this paper we propose a method that employs a non-uniform fixed-point quantization and a virtual bit shift (VBS) to improve the accuracy of the quantization of the DNN weights. We evaluate our method in a speech enhancement application, where a fully connected DNN is used to predict the clean speech spectrum from the input noisy speech spectrum. A DNN is optimized, its memory requirement is calculated, and its performance is evaluated using the short-time objective intelligibility (STOI) metric. The application of the low-bit quantization leads to a 50% reduction of the DNN memory requirement while the STOI performance drops only by 2.7%.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"50 1","pages":"466-470"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86000385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRNU-leaks: facts and remedies","authors":"F. Pérez-González, Samuel Fernández-Menduiña","doi":"10.23919/Eusipco47968.2020.9287451","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287451","url":null,"abstract":"We address the problem of information leakage from estimates of the PhotoResponse Non-Uniformity (PRNU) fingerprints of a sensor. This leakage may compromise privacy in forensic scenarios, as it may reveal information from the images used in the PRNU estimation. We propose a new way to compute the information-theoretic leakage that is based on embedding synthetic PRNUs, and present affordable approximations and bounds. We also propose a new compact measure for the performance in membership inference tests. Finally, we analyze two potential countermeasures against leakage: binarization, which was already used in PRNU-storage contexts, and equalization, which is novel and offers better performance. Theoretical results are validated with experiments carried out on a real-world image dataset.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"34 1","pages":"720-724"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86034936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNN-based Note Onset Detection using Synthetic Data Augmentation","authors":"Mina Mounir, P. Karsmakers, T. Waterschoot","doi":"10.23919/Eusipco47968.2020.9287621","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287621","url":null,"abstract":"Detecting the onset of notes in music excerpts is a fundamental problem in many music signal processing tasks, including analysis, synthesis, and information retrieval. When addressing the note onset detection (NOD) problem using a data-driven methodology, a major challenge is the availability and quality of labeled datasets used for both model training/tuning and evaluation. As most of the available datasets are manually annotated, the number of annotated music excerpts is limited and the annotation strategy and quality vary across datasets. To counter both problems, in this paper we propose to use semi-synthetic datasets where the music excerpts are mixes of isolated note recordings. The advantage resides in the annotations being automatically generated while mixing the notes, as isolated note onsets are straightforward to detect using a simple energy measure. A semi-synthetic dataset is used in this work for augmenting a real piano dataset when training a convolutional neural network (CNN) with three novel model training strategies. Training the CNN on a semi-synthetic dataset and retraining only the CNN classification layers on a real dataset results in a higher average F1-score (F1) with lower variance.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"89 1","pages":"171-175"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74046757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
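The automatic annotation idea in the note onset detection abstract above (mix isolated notes and derive each onset from a simple energy measure) can be illustrated with a minimal sketch. The abstract does not specify the energy measure, so everything here is an assumption made for illustration: the frame length, hop size, the 10% of peak energy threshold, and the helper names `frame_energies`, `isolated_note_onset`, and `mix_notes` are all hypothetical, not the authors' implementation.

```python
def frame_energies(samples, frame_len=256, hop=128):
    """Short-time energy (sum of squares) of overlapping frames."""
    energies = []
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def isolated_note_onset(samples, sample_rate, frame_len=256, hop=128, ratio=0.1):
    """Onset time (s) of an isolated note: start of the first frame whose
    energy reaches `ratio` times the peak frame energy (assumed rule)."""
    energies = frame_energies(samples, frame_len, hop)
    peak = max(energies)
    if peak == 0:
        return None  # silent recording, no onset to report
    for index, energy in enumerate(energies):
        if energy >= ratio * peak:
            return index * hop / sample_rate
    return None


def mix_notes(notes, offsets_s, sample_rate):
    """Mix isolated notes at the given offsets (s) and return the mix
    together with automatically generated onset annotations."""
    starts = [int(round(t * sample_rate)) for t in offsets_s]
    length = max(s + len(n) for s, n in zip(starts, notes))
    mix = [0.0] * length
    onsets = []
    for note, start in zip(notes, starts):
        for i, x in enumerate(note):
            mix[start + i] += x
        onset = isolated_note_onset(note, sample_rate)
        if onset is not None:
            # Annotation = placement offset + onset inside the isolated note.
            onsets.append(start / sample_rate + onset)
    return mix, sorted(onsets)
```

With this rule the onset is only resolved to the frame grid, so the annotation error is bounded by roughly one frame length; a finer hop trades computation for resolution.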
{"title":"Generic Compression of Off-The-Air Radio Frequency Signals with Grouped-Bin FFT Quantisation","authors":"D. Muir, L. Crockett, R. Stewart","doi":"10.23919/Eusipco47968.2020.9287457","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287457","url":null,"abstract":"This paper studies the capabilities of a proposed lossy, grouped-bin FFT quantisation compression method for targeting Off-The-Air (OTA) Radio Frequency (RF) signals. The bins within a 512-point Fast Fourier Transform (FFT) are split into groups of adjacent bins, and these groups are each quantised separately. Additional compression can be achieved by setting groups which are not deemed to contain significant information to zero, based on a pre-defined minimum magnitude threshold. In this paper, we propose two alternative methods for quantising the remaining groups. The first of these, Grouped-bin FFT Threshold Quantisation (GFTQ), involves allocating quantisation wordlengths based on several pre-defined magnitude thresholds. The second, Grouped-bin FFT Error Quantisation (GFEQ), involves incrementing the quantisation wordlength for each group until the calculated quantisation error falls below a minimum error threshold. Both algorithms were tested for a variety of signal types, including Digital Private Mobile Radio 446 MHz (dPMR446), which was considered as a case study. While GFTQ allowed for higher Compression Ratios (CR), the compression process resulted in added quantisation noise. The GFEQ algorithm achieved lower CRs, but also lower noise levels across all test signals.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"1767-1771"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74000525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
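The GFEQ procedure described in the abstract above (zero out insignificant groups, then grow each remaining group's wordlength until the quantisation error falls below a threshold) might be sketched as follows. This is an illustrative sketch only: the abstract does not define the quantiser or the error measure, so the uniform signed fixed-point quantiser, the relative maximum-error metric, the group size of 8, and all default thresholds are assumptions, and `quantise` and `gfeq` are hypothetical names.

```python
def quantise(value, wordlength, full_scale):
    """Uniform signed fixed-point quantiser with saturation (assumed form)."""
    levels = 2 ** (wordlength - 1)
    step = full_scale / levels
    code = round(value / step)
    code = max(-levels, min(levels - 1, code))  # clip to the representable range
    return code * step


def gfeq(bins, group_size=8, zero_threshold=0.01, max_error=0.01,
         max_wordlength=16):
    """Quantise complex FFT bins group by group.

    Returns the quantised bins and the wordlength chosen per group
    (0 marks a group that was set to zero)."""
    quantised = []
    wordlengths = []
    for start in range(0, len(bins), group_size):
        group = bins[start:start + group_size]
        peak = max(abs(b) for b in group)
        if peak <= zero_threshold:
            # Group carries no significant information: store zeros.
            quantised.extend([0j] * len(group))
            wordlengths.append(0)
            continue
        full_scale = max(max(abs(b.real), abs(b.imag)) for b in group)
        for wordlength in range(2, max_wordlength + 1):
            candidate = [
                complex(quantise(b.real, wordlength, full_scale),
                        quantise(b.imag, wordlength, full_scale))
                for b in group
            ]
            # Assumed error measure: worst-case bin error relative to the peak.
            error = max(abs(a - q) for a, q in zip(group, candidate)) / peak
            if error <= max_error:
                break
        quantised.extend(candidate)
        wordlengths.append(wordlength)
    return quantised, wordlengths


if __name__ == "__main__":
    example = [complex(1.0, -0.5)] * 8 + [complex(0.001, 0.0)] * 8
    _, chosen = gfeq(example)
    print(chosen)
```

Under these assumptions, a tighter `max_error` pushes wordlengths up and the compression ratio down, which matches the trade-off the abstract reports between GFEQ and GFTQ.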
Xianghui Xie, Jared Houghtaling, K. Foubert, T. Waterschoot
{"title":"Computational Approach to Track Beats in Improvisational Music Performance","authors":"Xianghui Xie, Jared Houghtaling, K. Foubert, T. Waterschoot","doi":"10.23919/Eusipco47968.2020.9287444","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287444","url":null,"abstract":"Beat tracking, or identifying the temporal locations of beats in a musical recording, has a variety of applications that range from music information retrieval to machine listening. Algorithms designed to monitor the tempo of a musical recording have thus far been optimized for music with relatively stable rhythms, repetitive structures, and consistent melodies; these algorithms typically struggle to follow the free-form nature of improvisational music. Here, we present a multi-agent improvisation beat tracker (MAIBT) that addresses the challenges posed by improvisations and compare its performance with other state-of-the-art methods on a unique data set collected during improvisational music therapy sessions. This algorithm is designed for MIDI files and proceeds in four stages: (1) preprocessing to remove notes that are timid and overlapping, (2) clustering of the remaining notes and subsequent ranking of the clusters, (3) agent initialization and performance-based selection, and (4) artificial beat insertion and deletion to fill remaining beat gaps and create a comprehensive beat sequence. This particular method performs better than other generic beat-tracking approaches for music that lacks regularity; it is thus well suited to applications where unpredictability and inaccuracy are predominant, such as in music therapy improvisation.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"11 2 1","pages":"166-170"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72731551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Jaber, A. Nasser, N. Charara, A. Mansour, K. Yao
{"title":"One-Class based learning for Hybrid Spectrum Sensing in Cognitive Radio","authors":"M. Jaber, A. Nasser, N. Charara, A. Mansour, K. Yao","doi":"10.23919/Eusipco47968.2020.9287326","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287326","url":null,"abstract":"The main aim of Spectrum Sensing (SS) in a Cognitive Radio system is to distinguish between the binary hypotheses H0: Primary User (PU) is absent and H1: PU is active. In this paper, a Machine Learning (ML)-based hybrid SS scheme is proposed. The scattering of the Test Statistics (TSs) of two detectors is used in the learning and prediction phases. As the SS decision is binary, the proposed scheme requires the learning of only the boundaries of the H0-class in order to make a decision on the PU status: active or idle. Thus, a set of data generated under the H0 hypothesis is used to train the detection system. Accordingly, unlike the existing ML-based schemes in the literature, no PU statistical parameters are required. In order to discriminate between the H0-class and elsewhere, we used a one-class classification approach that is inspired by the Isolation Forest algorithm. Extensive simulations are done in order to investigate the efficiency of such hybrid SS and the impact of the novelty detection model parameters on the detection performance. Indeed, these simulations corroborate the efficiency of the proposed one-class learning of the hybrid SS system.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"1683-1686"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77083788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio-Visual Speech Classification based on Absent Class Detection","authors":"G. D. Sad, J. Gómez","doi":"10.23919/Eusipco47968.2020.9287615","DOIUrl":"https://doi.org/10.23919/Eusipco47968.2020.9287615","url":null,"abstract":"In the present paper, a novel method for Audio-Visual Speech Recognition is introduced, aiming to minimize the intra-class errors. Based on a novel training procedure, the Complementary Models are introduced. These models aim to detect the absence of a class, in contrast to traditional models that aim to detect the presence of a class. In the proposed method, traditional models are employed in the first stage of a cascade scheme, and then the proposed complementary models are used to make the final decision on the recognition results. Experimental results in all the scenarios evaluated (different input modalities, three databases, four classifiers, and noisy acoustic conditions) show that a good performance is achieved with the proposed scheme. The scheme also achieves better results than other methods reported in the literature on two public databases.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"77 1","pages":"336-340"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80987401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}