ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Robust Binary Loss for Multi-Category Classification with Label Noise
Defu Liu, Guowu Yang, Jinzhao Wu, Jiayi Zhao, Fengmao Lv
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414493 | Published: 2021-06-06
Abstract: Deep learning has achieved tremendous success in image classification. However, the corresponding performance leap relies heavily on large-scale accurate annotations, which are usually hard to collect in reality. It is essential to explore methods that can train deep models effectively under label noise. To address the problem, we propose to train deep models with robust binary loss functions. To be specific, we tackle the K-class classification task by using K binary classifiers. We can immediately use multi-category large margin classification approaches, e.g., Pairwise-Comparison (PC) or One-Versus-All (OVA), to jointly train the binary classifiers for multi-category classification. Our method can be robust to label noise if symmetric functions, e.g., the sigmoid loss or the ramp loss, are employed as the binary loss function in the framework of risk minimization. The learning theory reveals that our method can be inherently tolerant to label noise in multi-category classification tasks. Extensive experiments over different datasets with different types of label noise are conducted. The experimental results clearly confirm the effectiveness of our method.
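The abstract's core construction, K one-versus-all binary classifiers trained with a symmetric binary loss, is easy to sketch. The PyTorch snippet below is only an illustration of that idea; the function name, shapes, and reduction are assumptions, not the authors' code.

```python
import torch

def ova_sigmoid_loss(logits, targets):
    """One-Versus-All training of K binary classifiers with the symmetric
    sigmoid loss. logits: (batch, K) scores; targets: (batch,) labels in [0, K)."""
    batch = logits.shape[0]
    signs = -torch.ones_like(logits)                    # -1 means "not this class"
    signs[torch.arange(batch, device=logits.device), targets] = 1.0  # +1 for the true class
    # sigmoid loss l(z) = sigmoid(-z); bounded and symmetric: l(z) + l(-z) = 1
    per_binary = torch.sigmoid(-signs * logits)
    return per_binary.sum(dim=1).mean()                 # sum over K classifiers, mean over batch
```

The symmetry l(z) + l(-z) = 1, which the sigmoid loss satisfies, is the property the paper's risk-minimization analysis ties to label-noise tolerance.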
Citations: 3
Privacy-Accuracy Trade-Off of Inference as Service
Yulu Jin, L. Lai
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413438 | Published: 2021-06-06
Abstract: In this paper, we propose a general framework to provide a desirable trade-off between inference accuracy and privacy protection in the inference-as-service scenario. Instead of sending data directly to the server, the user preprocesses the data through a privacy-preserving mapping, which increases privacy protection but reduces inference accuracy. To properly address the trade-off between privacy protection and inference accuracy, we formulate an optimization problem to find the optimal privacy-preserving mapping. Even though the problem is non-convex in general, we characterize useful structural properties of the problem and develop an iterative algorithm to find the desired privacy-preserving mapping.
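The abstract does not give the concrete objective, so the toy sketch below only illustrates the general shape of such a trade-off: a randomized mapping q(z|x) is optimized to balance an accuracy proxy (expected distortion) against a privacy proxy (mutual information I(X;Z)). The proxies, the small alphabets, and the scalarized penalty form are all assumptions, not the paper's formulation.

```python
import torch

# Toy alphabets: X takes 4 values; the mapping q(z|x) releases one of 4 symbols.
p_x = torch.tensor([0.1, 0.2, 0.3, 0.4])
distortion = 1.0 - torch.eye(4)        # Hamming distortion d(x, z) as the accuracy proxy
lam = 0.5                              # privacy weight: larger -> more private, less accurate

logits = torch.zeros(4, 4, requires_grad=True)   # parameters of q(z|x)
opt = torch.optim.Adam([logits], lr=0.05)

for _ in range(2000):
    q_zx = torch.softmax(logits, dim=1)          # each row is a valid distribution over z
    p_z = p_x @ q_zx                             # marginal of the released symbol Z
    # privacy proxy: I(X;Z) = sum_{x,z} p(x) q(z|x) log[q(z|x) / p(z)]
    mi = (p_x[:, None] * q_zx * (q_zx / p_z).log()).sum()
    # accuracy proxy: expected distortion between X and the released Z
    exp_dist = (p_x[:, None] * q_zx * distortion).sum()
    loss = exp_dist + lam * mi                   # scalarized trade-off objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```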
Citations: 1
Deep Lung Auscultation Using Acoustic Biomarkers for Abnormal Respiratory Sound Event Detection
Upasana Tiwari, Swapnil Bhosale, Rupayan Chakraborty, S. Kopparapu
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414845 | Published: 2021-06-06
Abstract: Lung auscultation is a non-invasive process of distinguishing normal respiratory sounds from abnormal ones by analyzing the airflow along the respiratory tract. With developments in Deep Learning (DL) techniques and wider access to anonymized medical data, automatic detection of specific sounds such as crackles and wheezes has been gaining popularity. In this paper, we propose to use two sets of diversified acoustic biomarkers: features extracted using the Discrete Wavelet Transform (DWT), and deep encoded features from an intermediate layer of a pre-trained Audio Event Detection (AED) model trained on sounds from daily activities. The first set of biomarkers highlights the time-frequency localization characteristics obtained from the DWT coefficients. The second set of deep encoded biomarkers captures a generalized, reliable representation, and thus compensates for the scarcity of training samples and the class imbalance in the dataset. The model trained using these features achieves a 15.05% increase in specificity over a baseline model that uses spectrogram features. Moreover, an ensemble of the DWT-feature and deep-encoded-feature based models shows absolute improvements of 8.32%, 6.66% and 7.40% in sensitivity, specificity and ICBHI score, respectively, clearly outperforming the state of the art by a significant margin.
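As a rough illustration of the first biomarker set, time-frequency statistics of DWT sub-bands can be computed with PyWavelets. The wavelet family, decomposition level, and summary statistics below are assumptions, since the abstract does not specify them.

```python
import numpy as np
import pywt  # PyWavelets (pip install PyWavelets)

def dwt_biomarkers(audio, wavelet="db4", level=5):
    """Summary statistics of DWT sub-band coefficients as simple
    time-frequency features for a lung-sound segment."""
    coeffs = pywt.wavedec(audio, wavelet, level=level)
    feats = []
    for band in coeffs:  # approximation band + one detail band per level
        feats += [band.mean(), band.std(),
                  np.log(np.sum(band ** 2) + 1e-12)]  # log sub-band energy
    return np.asarray(feats)

# Example: one 4-second recording sampled at 4 kHz (random stand-in data)
features = dwt_biomarkers(np.random.randn(4 * 4000))
```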
Citations: 3
A Periodic Frame Learning Approach for Accurate Landmark Localization in M-Mode Echocardiography
Yinbing Tian, Shibiao Xu, Li Guo, Fu'ze Cong
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414375 | Published: 2021-06-06
Abstract: Anatomical landmark localization has been a key challenge in medical image analysis. Existing research mostly adopts CNNs as the main architecture for landmark localization, but these are ill-suited to image modalities with periodic structure. In this paper, we propose a novel two-stage frame-level detection and heatmap regression model for accurate landmark localization in M-mode echocardiography, which promotes better integration of global context information and local appearance. Specifically, a periodic frame detection module with an LSTM is designed to model the periodic context and detect systolic and diastolic frames in the original echocardiogram. Next, a CNN-based heatmap regression model predicts the landmark location within each systolic or diastolic local region. Experimental results show that the proposed model achieves an average distance error of 9.31, a 24% reduction compared to baseline models.
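A minimal sketch of the first stage (a per-frame CNN encoder feeding an LSTM for periodic systole/diastole detection) might look like the following. All layer sizes are illustrative assumptions, and the heatmap-regression second stage is omitted.

```python
import torch
import torch.nn as nn

class PeriodicFrameDetector(nn.Module):
    """Stage 1: per-frame systole/diastole detection over a frame sequence."""
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                      # per-frame appearance encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 3)           # background / systole / diastole

    def forward(self, frames):                         # frames: (B, T, 1, H, W)
        b, t = frames.shape[:2]
        f = self.cnn(frames.flatten(0, 1)).view(b, t, -1)  # encode each frame
        h, _ = self.lstm(f)                            # periodic temporal context
        return self.head(h)                            # per-frame class logits
```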
Citations: 2
Improvements to Prosodic Alignment for Automatic Dubbing
Yogesh Virkar, Marcello Federico, Robert Enyedi, R. Barra-Chicote
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414966 | Published: 2021-06-06
Abstract: Automatic dubbing is an extension of speech-to-speech translation in which the resulting target speech is carefully aligned with the duration, lip movements, timbre, emotion, prosody, etc. of the original speaker in order to achieve audiovisual coherence. Dubbing quality strongly depends on isochrony, i.e., arranging the translation of the original speech to optimally match its sequence of phrases and pauses. To this end, we present improvements to the prosodic alignment component of our recently introduced dubbing architecture. We present empirical results for four dubbing directions (English to French, Italian, German and Spanish) on a publicly available collection of TED Talks. Compared to previous work, our enhanced prosodic alignment model significantly improves prosodic alignment accuracy and provides segmentation perceptibly better than, or on par with, manually annotated reference segmentation.
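Isochrony-driven segmentation can be illustrated with a small dynamic program that splits the translated word sequence into segments whose durations best match the source phrases. This is a deliberately simplified stand-in under a squared-error cost, not the paper's prosodic alignment model.

```python
import numpy as np

def segment_translation(word_durs, phrase_durs):
    """Split translated words into len(phrase_durs) contiguous segments whose
    total durations best match the source phrase durations (squared error).
    Returns the word indices at which the m-1 segment breaks fall."""
    n, m = len(word_durs), len(phrase_durs)
    prefix = np.concatenate([[0.0], np.cumsum(word_durs)])
    cost = np.full((n + 1, m + 1), np.inf)
    back = np.zeros((n + 1, m + 1), dtype=int)
    cost[0, 0] = 0.0
    for j in range(1, m + 1):                 # segment index
        for i in range(j, n + 1):             # words consumed so far
            for k in range(j - 1, i):         # segment j covers words k .. i-1
                c = cost[k, j - 1] + (prefix[i] - prefix[k] - phrase_durs[j - 1]) ** 2
                if c < cost[i, j]:
                    cost[i, j], back[i, j] = c, k
    breaks, i = [], n                         # backtrack the chosen break points
    for j in range(m, 0, -1):
        i = back[i, j]
        breaks.append(i)
    return breaks[::-1][1:]                   # drop the leading 0
```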
Citations: 16
Sparse Parameter Estimation for PMCW MIMO Radar Using Few-Bit ADCs
Chao-Yi Wu, Jian Li, T. Wong
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414267 | Published: 2021-06-06
Abstract: In this work, we consider target parameter estimation for phase-modulated continuous-wave (PMCW) multiple-input multiple-output (MIMO) radars with few-bit analog-to-digital converters (ADCs). We formulate the estimation problem as a sparse signal recovery problem and modify the fast iterative shrinkage-thresholding algorithm (FISTA) to solve it. The ℓ2,1-norm is adopted to promote sparsity in the range-Doppler-angle domain. Simulation results show that few-bit ADCs can achieve performance comparable to many-bit ADCs when targets are widely separated. However, if targets are closely spaced, performance losses can occur with 1-bit ADCs.
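A generic FISTA solver for the ℓ2,1-regularized least-squares problem the abstract describes is sketched below; the paper's modified variant and the radar-specific dictionary A are not reproduced, and the grouping of rows is an assumption.

```python
import numpy as np

def fista_l21(A, Y, lam, n_iter=200):
    """FISTA for  min_X 0.5*||A X - Y||_F^2 + lam * ||X||_{2,1}.
    Rows of X are the groups (e.g., one row per range-Doppler-angle cell)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    X = Z = np.zeros((A.shape[1], Y.shape[1]), dtype=np.result_type(A, Y))
    t = 1.0
    for _ in range(n_iter):
        G = Z - (A.conj().T @ (A @ Z - Y)) / L          # gradient step
        row_norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - (lam / L) / np.maximum(row_norms, 1e-12), 0.0)
        X_new = shrink * G                              # row-wise soft threshold (prox of l2,1)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)   # momentum step
        X, t = X_new, t_new
    return X
```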
Citations: 1
DoA Estimation of a Hidden RF Source Exploiting Simple Backscatter Radio Tags
G. Vougioukas, A. Bletsas
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414918 | Published: 2021-06-06
Abstract: Conventional direction-of-arrival (DoA) techniques employ multi-antenna receivers with increased complexity and cost. This work emulates a multi-antenna system using a single-antenna receiver, exploiting the beauty and simplicity of backscatter radio. More specifically, a number of simple backscatter radio tags offer copies of the hidden RF source, relayed in space and shifted in frequency, while requiring minimal time synchronization. The DoA of a hidden RF source was estimated with an error of less than 5 degrees, exploiting a small number of simple, ultra-low-cost backscattering tags.
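Once the receiver separates the frequency-shifted copies, the tags act as a virtual array. A matched-steering-vector grid search over candidate angles is one minimal way to estimate the DoA; the paper's estimator may differ, and the collinear geometry and names here are assumptions.

```python
import numpy as np

def doa_from_tag_copies(h, tag_pos, wavelength):
    """Grid-search DoA from the complex amplitudes h[k] of the frequency-
    separated copies relayed by tags at collinear positions tag_pos (meters)."""
    grid_deg = np.arange(-90.0, 90.5, 0.5)
    angles = np.deg2rad(grid_deg)
    # candidate steering vectors for the tag "virtual array"
    S = np.exp(-2j * np.pi * np.outer(tag_pos, np.sin(angles)) / wavelength)
    spectrum = np.abs(S.conj().T @ h) ** 2     # matched-filter spatial spectrum
    return grid_deg[int(np.argmax(spectrum))]
```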
Citations: 2
A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
Otavio Braga, O. Siohan
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414160 | Published: 2021-06-06
Abstract: Audio-visual automatic speech recognition is a promising approach to robust ASR under noisy conditions. However, until recently it had been studied in isolation, assuming that the video of a single speaking face matches the audio; selecting the active speaker at inference time when multiple people are on screen was set aside as a separate problem. As an alternative, recent work has proposed to address the two problems simultaneously with an attention mechanism, baking the speaker selection problem directly into a fully differentiable model. One interesting finding was that the attention indirectly learns the association between the audio and the speaking face, even though this correspondence is never explicitly provided at training time. In the present work we further investigate this connection and examine the interplay between the two problems. With experiments involving over 50 thousand hours of public YouTube videos as training data, we first evaluate the accuracy of the attention layer on an active speaker selection task. Second, we show under closer scrutiny that an end-to-end model performs at least as well as a considerably larger two-step system that uses a hard decision boundary, under various noise conditions and numbers of parallel face tracks.
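The attention mechanism that soft-selects among parallel face tracks, conditioned on the audio, can be sketched as follows; the projections and dimensions are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SpeakerSelectionAttention(nn.Module):
    """Soft-select among N parallel face tracks conditioned on the audio,
    so speaker selection trains end-to-end with the recognizer."""
    def __init__(self, audio_dim=512, video_dim=512, att_dim=128):
        super().__init__()
        self.q = nn.Linear(audio_dim, att_dim)   # query from each audio frame
        self.k = nn.Linear(video_dim, att_dim)   # key from each face track

    def forward(self, audio, faces):
        # audio: (B, T, audio_dim); faces: (B, T, N_tracks, video_dim)
        scores = torch.einsum('bta,btna->btn',
                              self.q(audio), self.k(faces)) / self.k.out_features ** 0.5
        w = scores.softmax(dim=-1)                        # attention over face tracks
        return torch.einsum('btn,btnv->btv', w, faces)    # selected face features
```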
Citations: 5
Radio Frequency Based Heart Rate Variability Monitoring
Fengyu Wang, Xiaolu Zeng, Chenshu Wu, Beibei Wang, K. Liu
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413465 | Published: 2021-06-06
Abstract: Heart Rate Variability (HRV), which measures the fluctuation of heartbeat intervals, has been considered an important indicator for general health evaluation. In this paper, we present mmHRV, a contact-free HRV monitoring system using commercial millimeter-wave (mmWave) radio. We devise a heartbeat signal extractor that optimizes the decomposition of the phase of the channel information modulated by chest movement, and thus estimates the heartbeat signal. The exact times of heartbeats are estimated by locating the peaks of the heartbeat signal, and the Inter-Beat Intervals (IBIs) are further derived for evaluating the HRV metrics. Experimental results over 10 participants show that mmHRV measures HRV accurately, with an average mean-IBI error of 3.68 ms (corresponding to 99.49% accuracy).
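Given the estimated heartbeat times, the standard time-domain HRV metrics follow directly from the inter-beat intervals. This sketch covers only that last step, not mmHRV's phase decomposition or peak detection.

```python
import numpy as np

def hrv_metrics(beat_times_s):
    """Standard time-domain HRV metrics from estimated heartbeat times (seconds)."""
    ibi = np.diff(beat_times_s) * 1000.0               # inter-beat intervals in ms
    return {
        "mean_ibi": ibi.mean(),
        "sdnn": ibi.std(ddof=1),                       # overall variability
        "rmssd": np.sqrt(np.mean(np.diff(ibi) ** 2)),  # beat-to-beat variability
        "pnn50": np.mean(np.abs(np.diff(ibi)) > 50.0) * 100.0,  # % successive diffs > 50 ms
    }
```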
Citations: 1
Arrhythmia Classification with Heartbeat-Aware Transformer
Bin Wang, Chang Liu, Chuanyan Hu, Xudong Liu, Jun Cao
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413938 | Published: 2021-06-06
Abstract: Electrocardiography (ECG) is a conventional method in arrhythmia diagnosis. In this paper, we propose a novel neural network model that treats the typical heartbeat classification task as a 'translation' problem. We introduce a Transformer structure into the model and add a heartbeat-aware attention mechanism to enhance the alignment between the encoded and decoded sequences. After training on an ECG database collected from 200k patients in over 2,000 hospitals for more than 10 years, validation on an independent test dataset shows that this heartbeat-aware Transformer model outperforms the classic Transformer and other sequence-to-sequence methods. Finally, we show that visualizing the encoder-decoder attention weights provides interpretable information about how the Transformer makes a diagnosis from raw ECG signals, which has guiding significance for clinical diagnosis.
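The 'translation' framing can be sketched with a vanilla encoder-decoder Transformer over raw ECG; the paper's heartbeat-aware attention bias is the novelty and is not reproduced here, and all sizes (and the convolutional patching) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ECGSeq2Seq(nn.Module):
    """Vanilla Transformer framing of 'ECG signal -> per-beat label sequence'.
    Positional encodings are omitted for brevity."""
    def __init__(self, n_labels=5, d_model=128):
        super().__init__()
        self.embed_sig = nn.Conv1d(1, d_model, kernel_size=16, stride=8)  # patchify raw ECG
        self.embed_lab = nn.Embedding(n_labels, d_model)
        self.transformer = nn.Transformer(d_model, nhead=4,
                                          num_encoder_layers=4, num_decoder_layers=4,
                                          batch_first=True)
        self.out = nn.Linear(d_model, n_labels)

    def forward(self, ecg, label_in):
        # ecg: (B, 1, L) raw signal; label_in: (B, T) shifted target labels
        src = self.embed_sig(ecg).transpose(1, 2)        # (B, L', d_model)
        tgt = self.embed_lab(label_in)                   # (B, T, d_model)
        mask = self.transformer.generate_square_subsequent_mask(label_in.size(1))
        return self.out(self.transformer(src, tgt, tgt_mask=mask))  # (B, T, n_labels)
```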
Citations: 6