{"title":"L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing","authors":"E. Guizzo, Riccardo F. Gramaccioni, Saeid Jamili, C. Marinoni, Edoardo Massaro, Claudia Medaglia, Giuseppe Nachira, Leonardo Nucciarelli, Ludovica Paglialunga, Marco Pennese, Sveva Pepe, Enrico Rocchi, A. Uncini, D. Comminiello","doi":"10.1109/MLSP52302.2021.9596248","DOIUrl":"https://doi.org/10.1109/MLSP52302.2021.9596248","url":null,"abstract":"The L3DAS21 Challenge (www.l3das.com/mlsp2021) is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with a particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration is used for these tasks. 
We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDnet for SELD.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128141916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Complexity Measures for Deep Learning Generalization in Medical Image Analysis","authors":"Aleksandar Vakanski, Min Xian","doi":"10.1109/MLSP52302.2021.9596501","DOIUrl":"https://doi.org/10.1109/MLSP52302.2021.9596501","url":null,"abstract":"The generalization error of deep learning models for medical image analysis often increases on images collected with different devices for data acquisition, device settings, or patient population. A better understanding of the generalization capacity on new images is crucial for earning clinicians' trust. Although significant efforts have recently been directed toward establishing generalization bounds and complexity measures, there is still a significant discrepancy between the predicted and actual generalization performance. Moreover, related large-scale empirical studies have been primarily based on validation with general-purpose image datasets. This paper presents an empirical study that investigates the correlation between 25 complexity measures and the generalization abilities of deep learning classifiers for breast ultrasound images. The results indicate that PAC-Bayes flatness and path norm measures produce the most consistent explanation for the combination of models and data. 
We also report that a multi-task classification and segmentation approach for breast images is conducive to improved generalization.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126946096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Data Visualization and Denoising with Integrated Diffusion","authors":"Manik Kuchroo, Abhinav Godavarthi, Alexander Tong, Guy Wolf, Smita Krishnaswamy","doi":"10.1109/mlsp52302.2021.9596214","DOIUrl":"https://doi.org/10.1109/mlsp52302.2021.9596214","url":null,"abstract":"We propose a method called integrated diffusion for combining multimodal data, gathered via different sensors on the same system, to create an integrated data diffusion operator. As real-world data suffers from both local and global noise, we introduce mechanisms to optimally calculate a diffusion operator that reflects the combined information in the data by maintaining low-frequency eigenvectors of each modality both globally and locally. We show the utility of this integrated operator in denoising and visualizing multimodal toy data as well as multi-omic data generated from blood cells, measuring both gene expression and chromatin accessibility. Our approach better visualizes the geometry of the integrated data and captures known cross-modality associations. More generally, integrated diffusion is broadly applicable to multimodal datasets generated by noisy sensors in a variety of fields.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131948297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GRAD-CAM Guided Channel-Spatial Attention Module for Fine-Grained Visual Classification","authors":"Shuai Xu, Dongliang Chang, Jiyang Xie, Zhanyu Ma","doi":"10.1109/mlsp52302.2021.9596481","DOIUrl":"https://doi.org/10.1109/mlsp52302.2021.9596481","url":null,"abstract":"Fine-grained visual classification (FGVC) is becoming an important research field, due to its wide applications and the rapid development of computer vision technologies. The current state-of-the-art (SOTA) methods in FGVC usually employ attention mechanisms to first capture the semantic parts and then discover the subtle differences between distinct classes. Existing attention modules have significantly improved classification performance, but they are poorly guided, since part-based detectors in FGVC depend on the network's learning ability without the supervision of part annotations. As obtaining such part annotations is labor-intensive, visual localization and explanation methods, such as gradient-weighted class activation mapping (Grad-CAM), can be utilized to supervise the attention mechanism. In this paper, we propose a Grad-CAM guided channel-spatial attention module for FGVC, which employs Grad-CAM to supervise and constrain the attention weights by generating coarse localization maps. To demonstrate the effectiveness of the proposed method, we conduct comprehensive experiments on three popular FGVC datasets: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. The proposed method outperforms the SOTA attention modules in the FGVC task. 
In addition, visualizations of the feature maps demonstrate the superiority of the proposed method against the SOTA approaches.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126181784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Variational Generative Models for Audio-Visual Speech Separation","authors":"V. Nguyen, M. Sadeghi, E. Ricci, Xavier Alameda-Pineda","doi":"10.1109/mlsp52302.2021.9596406","DOIUrl":"https://doi.org/10.1109/mlsp52302.2021.9596406","url":null,"abstract":"In this paper, we are interested in audio-visual speech separation given a single-channel audio recording as well as visual information (lip movements) associated with each speaker. We propose an unsupervised technique based on audio-visual generative modeling of clean speech. More specifically, during training, a latent variable generative model is learned from clean speech spectra using a variational auto-encoder (VAE). To better utilize the visual information, the posteriors of the latent variables are inferred from mixed speech (instead of clean speech) as well as the visual data. The visual modality also serves as a prior for the latent variables, through a visual network. At test time, the learned generative model (for both speaker-independent and speaker-dependent scenarios) is combined with an unsupervised non-negative matrix factorization (NMF) variance model for background noise. All the latent variables and noise parameters are then estimated by a Monte Carlo expectation-maximization algorithm. 
Our experiments show that the proposed unsupervised VAE-based method yields better separation performance than NMF-based approaches as well as a supervised deep learning-based technique.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126925007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Semi-Supervised Generative Adversarial Network for Prediction of Genetic Disease Outcomes","authors":"C. Davi, U. Braga-Neto","doi":"10.1109/mlsp52302.2021.9596351","DOIUrl":"https://doi.org/10.1109/mlsp52302.2021.9596351","url":null,"abstract":"For most diseases, building large databases of labeled genetic data is an expensive and time-consuming task. To address this, we introduce the genetic Generative Adversarial Network (gGAN), a semi-supervised approach based on an innovative GAN architecture to create large synthetic genetic datasets starting with a small amount of labeled data and a large amount of unlabeled data. Our goal is to create a mechanism able to increase the sample size of the labeled data and generalize learning over different populations while keeping awareness of the quality of its own predictions. The proposed model achieved satisfactory results using real genetic data from different datasets and populations, in which the test populations may not have the same genetic profiles. The proposed model is self-aware and capable of determining whether a new genetic profile has enough compatibility with the data on which the network was trained and is thus suitable for prediction. The code and datasets used can be found at https://github.com/caio-davi/gGAN","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124120571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preferential Batch Bayesian Optimization","authors":"E. Siivola, Akash Kumar Dhaka, M. R. Andersen, Javier I. González, Pablo G. Moreno, Aki Vehtari","doi":"10.1109/mlsp52302.2021.9596494","DOIUrl":"https://doi.org/10.1109/mlsp52302.2021.9596494","url":null,"abstract":"Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyperparameter configuration problems. However, in domains such as modelling human preferences, A/B tests, or recommender systems, there is a need for methods that can replace direct feedback with preferential feedback, obtained via rankings or pairwise comparisons. In this work, we present preferential batch Bayesian optimization (PBBO), a new framework that allows finding the optimum of a latent function of interest, given any type of parallel preferential feedback for a group of two or more points. We do so by using a Gaussian process model with a likelihood specially designed to enable parallel and efficient data collection mechanisms, which are key in modern machine learning. We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization, expanding the use of these techniques to a wider range of domains. 
An extensive simulation study shows the benefits of this approach on both simulated functions and four real-world datasets.","PeriodicalId":156116,"journal":{"name":"2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"207 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120899791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}