ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Robust Binary Loss for Multi-Category Classification with Label Noise
Defu Liu, Guowu Yang, Jinzhao Wu, Jiayi Zhao, Fengmao Lv
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414493 | Published: 2021-06-06
Abstract: Deep learning has achieved tremendous success in image classification. However, the corresponding performance leap relies heavily on large-scale accurate annotations, which are usually hard to collect in reality. It is essential to explore methods that can train deep models effectively under label noise. To address the problem, we propose to train deep models with robust binary loss functions. To be specific, we tackle the K-class classification task by using K binary classifiers. We can immediately use multi-category large margin classification approaches, e.g., Pairwise-Comparison (PC) or One-Versus-All (OVA), to jointly train the binary classifiers for multi-category classification. Our method can be robust to label noise if symmetric functions, e.g., the sigmoid loss or the ramp loss, are employed as the binary loss function in the framework of risk minimization. The learning theory reveals that our method can be inherently tolerant to label noise in multi-category classification tasks. Extensive experiments over different datasets with different types of label noise are conducted. The experimental results clearly confirm the effectiveness of our method.
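The abstract's core construction, K one-versus-all binary classifiers trained with a symmetric binary loss, is easy to sketch. The PyTorch snippet below is only an illustration of that idea; the function name, shapes, and reduction are assumptions, not the authors' code.

```python
import torch

def ova_sigmoid_loss(logits, targets):
    """One-Versus-All training of K binary classifiers with the symmetric
    sigmoid loss. logits: (batch, K) scores; targets: (batch,) labels in [0, K)."""
    batch = logits.shape[0]
    signs = -torch.ones_like(logits)                    # -1 means "not this class"
    signs[torch.arange(batch, device=logits.device), targets] = 1.0  # +1 for the true class
    # sigmoid loss l(z) = sigmoid(-z); bounded and symmetric: l(z) + l(-z) = 1
    per_binary = torch.sigmoid(-signs * logits)
    return per_binary.sum(dim=1).mean()                 # sum over K classifiers, mean over batch
```

The symmetry l(z) + l(-z) = 1, which the sigmoid loss satisfies, is the property the paper's risk-minimization analysis ties to label-noise tolerance.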
Citations: 3
Privacy-Accuracy Trade-Off of Inference as Service
Yulu Jin, L. Lai
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413438 | Published: 2021-06-06
Abstract: In this paper, we propose a general framework to provide a desirable trade-off between inference accuracy and privacy protection in the inference-as-service scenario. Instead of sending data directly to the server, the user preprocesses the data through a privacy-preserving mapping, which increases privacy protection but reduces inference accuracy. To properly address the trade-off between privacy protection and inference accuracy, we formulate an optimization problem to find the optimal privacy-preserving mapping. Even though the problem is non-convex in general, we characterize useful structural properties of the problem and develop an iterative algorithm to find the desired privacy-preserving mapping.
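The abstract does not give the concrete objective, so the toy sketch below only illustrates the general shape of such a trade-off: a randomized mapping q(z|x) is optimized to balance an accuracy proxy (expected distortion) against a privacy proxy (mutual information I(X;Z)). The proxies, the small alphabets, and the scalarized penalty form are all assumptions, not the paper's formulation.

```python
import torch

# Toy alphabets: X takes 4 values; the mapping q(z|x) releases one of 4 symbols.
p_x = torch.tensor([0.1, 0.2, 0.3, 0.4])
distortion = 1.0 - torch.eye(4)        # Hamming distortion d(x, z) as the accuracy proxy
lam = 0.5                              # privacy weight: larger -> more private, less accurate

logits = torch.zeros(4, 4, requires_grad=True)   # parameters of q(z|x)
opt = torch.optim.Adam([logits], lr=0.05)

for _ in range(2000):
    q_zx = torch.softmax(logits, dim=1)          # each row is a valid distribution over z
    p_z = p_x @ q_zx                             # marginal of the released symbol Z
    # privacy proxy: I(X;Z) = sum_{x,z} p(x) q(z|x) log[q(z|x) / p(z)]
    mi = (p_x[:, None] * q_zx * (q_zx / p_z).log()).sum()
    # accuracy proxy: expected distortion between X and the released Z
    exp_dist = (p_x[:, None] * q_zx * distortion).sum()
    loss = exp_dist + lam * mi                   # scalarized trade-off objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```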
Citations: 1
Deep Lung Auscultation Using Acoustic Biomarkers for Abnormal Respiratory Sound Event Detection
Upasana Tiwari, Swapnil Bhosale, Rupayan Chakraborty, S. Kopparapu
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414845 | Published: 2021-06-06
Abstract: Lung auscultation is a non-invasive process of distinguishing normal respiratory sounds from abnormal ones by analyzing the airflow along the respiratory tract. With developments in Deep Learning (DL) techniques and wider access to anonymized medical data, automatic detection of specific sounds such as crackles and wheezes has been gaining popularity. In this paper, we propose to use two sets of diversified acoustic biomarkers: features extracted using the Discrete Wavelet Transform (DWT), and deep encoded features from an intermediate layer of a pre-trained Audio Event Detection (AED) model trained on sounds from daily activities. The first set of biomarkers highlights the time-frequency localization characteristics obtained from the DWT coefficients. The second set of deep encoded biomarkers captures a generalized, reliable representation, and thus compensates for the scarcity of training samples and the class imbalance in the dataset. The model trained using these features achieves a 15.05% increase in specificity over a baseline model that uses spectrogram features. Moreover, an ensemble of the DWT-feature and deep-encoded-feature based models shows absolute improvements of 8.32%, 6.66% and 7.40% in sensitivity, specificity and ICBHI score, respectively, clearly outperforming the state of the art by a significant margin.
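As a rough illustration of the first biomarker set, time-frequency statistics of DWT sub-bands can be computed with PyWavelets. The wavelet family, decomposition level, and summary statistics below are assumptions, since the abstract does not specify them.

```python
import numpy as np
import pywt  # PyWavelets (pip install PyWavelets)

def dwt_biomarkers(audio, wavelet="db4", level=5):
    """Summary statistics of DWT sub-band coefficients as simple
    time-frequency features for a lung-sound segment."""
    coeffs = pywt.wavedec(audio, wavelet, level=level)
    feats = []
    for band in coeffs:  # approximation band + one detail band per level
        feats += [band.mean(), band.std(),
                  np.log(np.sum(band ** 2) + 1e-12)]  # log sub-band energy
    return np.asarray(feats)

# Example: one 4-second recording sampled at 4 kHz (random stand-in data)
features = dwt_biomarkers(np.random.randn(4 * 4000))
```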
Citations: 3
A Periodic Frame Learning Approach for Accurate Landmark Localization in M-Mode Echocardiography
Yinbing Tian, Shibiao Xu, Li Guo, Fu'ze Cong
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414375 | Published: 2021-06-06
Abstract: Anatomical landmark localization has been a key challenge in medical image analysis. Existing research mostly adopts CNNs as the main architecture for landmark localization, but these are ill-suited to image modalities with periodic structure. In this paper, we propose a novel two-stage frame-level detection and heatmap regression model for accurate landmark localization in M-mode echocardiography, which promotes better integration of global context information and local appearance. Specifically, a periodic frame detection module with an LSTM is designed to model the periodic context and detect systolic and diastolic frames in the original echocardiogram. Next, a CNN-based heatmap regression model predicts the landmark location within each systolic or diastolic local region. Experimental results show that the proposed model achieves an average distance error of 9.31, a 24% reduction compared to baseline models.
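A minimal sketch of the first stage (a per-frame CNN encoder feeding an LSTM for periodic systole/diastole detection) might look like the following. All layer sizes are illustrative assumptions, and the heatmap-regression second stage is omitted.

```python
import torch
import torch.nn as nn

class PeriodicFrameDetector(nn.Module):
    """Stage 1: per-frame systole/diastole detection over a frame sequence."""
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                      # per-frame appearance encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 3)           # background / systole / diastole

    def forward(self, frames):                         # frames: (B, T, 1, H, W)
        b, t = frames.shape[:2]
        f = self.cnn(frames.flatten(0, 1)).view(b, t, -1)  # encode each frame
        h, _ = self.lstm(f)                            # periodic temporal context
        return self.head(h)                            # per-frame class logits
```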
Citations: 2
Improvements to Prosodic Alignment for Automatic Dubbing
Yogesh Virkar, Marcello Federico, Robert Enyedi, R. Barra-Chicote
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414966 | Published: 2021-06-06
Abstract: Automatic dubbing is an extension of speech-to-speech translation in which the resulting target speech is carefully aligned with the duration, lip movements, timbre, emotion, prosody, etc. of the original speaker in order to achieve audiovisual coherence. Dubbing quality strongly depends on isochrony, i.e., arranging the translation of the original speech to optimally match its sequence of phrases and pauses. To this end, we present improvements to the prosodic alignment component of our recently introduced dubbing architecture. We present empirical results for four dubbing directions (English to French, Italian, German and Spanish) on a publicly available collection of TED Talks. Compared to previous work, our enhanced prosodic alignment model significantly improves prosodic alignment accuracy and provides segmentation perceptibly better than, or on par with, manually annotated reference segmentation.
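Isochrony-driven segmentation can be illustrated with a small dynamic program that splits the translated word sequence into segments whose durations best match the source phrases. This is a deliberately simplified stand-in under a squared-error cost, not the paper's prosodic alignment model.

```python
import numpy as np

def segment_translation(word_durs, phrase_durs):
    """Split translated words into len(phrase_durs) contiguous segments whose
    total durations best match the source phrase durations (squared error).
    Returns the word indices at which the m-1 segment breaks fall."""
    n, m = len(word_durs), len(phrase_durs)
    prefix = np.concatenate([[0.0], np.cumsum(word_durs)])
    cost = np.full((n + 1, m + 1), np.inf)
    back = np.zeros((n + 1, m + 1), dtype=int)
    cost[0, 0] = 0.0
    for j in range(1, m + 1):                 # segment index
        for i in range(j, n + 1):             # words consumed so far
            for k in range(j - 1, i):         # segment j covers words k .. i-1
                c = cost[k, j - 1] + (prefix[i] - prefix[k] - phrase_durs[j - 1]) ** 2
                if c < cost[i, j]:
                    cost[i, j], back[i, j] = c, k
    breaks, i = [], n                         # backtrack the chosen break points
    for j in range(m, 0, -1):
        i = back[i, j]
        breaks.append(i)
    return breaks[::-1][1:]                   # drop the leading 0
```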
Citations: 16
Sparse Parameter Estimation for PMCW MIMO Radar Using Few-Bit ADCs
Chao-Yi Wu, Jian Li, T. Wong
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414267 | Published: 2021-06-06
Abstract: In this work, we consider target parameter estimation for phase-modulated continuous-wave (PMCW) multiple-input multiple-output (MIMO) radars with few-bit analog-to-digital converters (ADCs). We formulate the estimation problem as a sparse signal recovery problem and modify the fast iterative shrinkage-thresholding algorithm (FISTA) to solve it. The ℓ2,1-norm is adopted to promote sparsity in the range-Doppler-angle domain. Simulation results show that few-bit ADCs can achieve performance comparable to many-bit ADCs when targets are widely separated. However, if targets are closely spaced, performance losses can occur with 1-bit ADCs.
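A generic FISTA solver for the ℓ2,1-regularized least-squares problem the abstract describes is sketched below; the paper's modified variant and the radar-specific dictionary A are not reproduced, and the grouping of rows is an assumption.

```python
import numpy as np

def fista_l21(A, Y, lam, n_iter=200):
    """FISTA for  min_X 0.5*||A X - Y||_F^2 + lam * ||X||_{2,1}.
    Rows of X are the groups (e.g., one row per range-Doppler-angle cell)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    X = Z = np.zeros((A.shape[1], Y.shape[1]), dtype=np.result_type(A, Y))
    t = 1.0
    for _ in range(n_iter):
        G = Z - (A.conj().T @ (A @ Z - Y)) / L          # gradient step
        row_norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - (lam / L) / np.maximum(row_norms, 1e-12), 0.0)
        X_new = shrink * G                              # row-wise soft threshold (prox of l2,1)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)   # momentum step
        X, t = X_new, t_new
    return X
```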
Citations: 1
DoA Estimation of a Hidden RF Source Exploiting Simple Backscatter Radio Tags
G. Vougioukas, A. Bletsas
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414918 | Published: 2021-06-06
Abstract: Conventional direction-of-arrival (DoA) techniques employ multi-antenna receivers with increased complexity and cost. This work emulates a multi-antenna system using a single-antenna receiver, exploiting the beauty and simplicity of backscatter radio. More specifically, a number of simple backscatter radio tags offer copies of the hidden RF source, relayed in space and shifted in frequency, while requiring minimal time synchronization. The DoA of a hidden RF source was estimated with an error of less than 5 degrees, exploiting a small number of simple, ultra-low-cost backscattering tags.
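Once the receiver separates the frequency-shifted copies, the tags act as a virtual array. A matched-steering-vector grid search over candidate angles is one minimal way to estimate the DoA; the paper's estimator may differ, and the collinear geometry and names here are assumptions.

```python
import numpy as np

def doa_from_tag_copies(h, tag_pos, wavelength):
    """Grid-search DoA from the complex amplitudes h[k] of the frequency-
    separated copies relayed by tags at collinear positions tag_pos (meters)."""
    grid_deg = np.arange(-90.0, 90.5, 0.5)
    angles = np.deg2rad(grid_deg)
    # candidate steering vectors for the tag "virtual array"
    S = np.exp(-2j * np.pi * np.outer(tag_pos, np.sin(angles)) / wavelength)
    spectrum = np.abs(S.conj().T @ h) ** 2     # matched-filter spatial spectrum
    return grid_deg[int(np.argmax(spectrum))]
```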
Citations: 2
A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
Otavio Braga, O. Siohan
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414160 | Published: 2021-06-06
Abstract: Audio-visual automatic speech recognition is a promising approach to robust ASR under noisy conditions. However, until recently it had been studied in isolation, assuming that the video of a single speaking face matches the audio; selecting the active speaker at inference time when multiple people are on screen was set aside as a separate problem. As an alternative, recent work has proposed to address the two problems simultaneously with an attention mechanism, baking the speaker selection problem directly into a fully differentiable model. One interesting finding was that the attention indirectly learns the association between the audio and the speaking face, even though this correspondence is never explicitly provided at training time. In the present work we further investigate this connection and examine the interplay between the two problems. With experiments involving over 50 thousand hours of public YouTube videos as training data, we first evaluate the accuracy of the attention layer on an active speaker selection task. Second, we show under closer scrutiny that an end-to-end model performs at least as well as a considerably larger two-step system that uses a hard decision boundary, under various noise conditions and numbers of parallel face tracks.
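The attention mechanism that soft-selects among parallel face tracks, conditioned on the audio, can be sketched as follows; the projections and dimensions are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SpeakerSelectionAttention(nn.Module):
    """Soft-select among N parallel face tracks conditioned on the audio,
    so speaker selection trains end-to-end with the recognizer."""
    def __init__(self, audio_dim=512, video_dim=512, att_dim=128):
        super().__init__()
        self.q = nn.Linear(audio_dim, att_dim)   # query from each audio frame
        self.k = nn.Linear(video_dim, att_dim)   # key from each face track

    def forward(self, audio, faces):
        # audio: (B, T, audio_dim); faces: (B, T, N_tracks, video_dim)
        scores = torch.einsum('bta,btna->btn',
                              self.q(audio), self.k(faces)) / self.k.out_features ** 0.5
        w = scores.softmax(dim=-1)                        # attention over face tracks
        return torch.einsum('btn,btnv->btv', w, faces)    # selected face features
```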
Citations: 5
Radio Frequency Based Heart Rate Variability Monitoring
Fengyu Wang, Xiaolu Zeng, Chenshu Wu, Beibei Wang, K. Liu
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413465 | Published: 2021-06-06
Abstract: Heart Rate Variability (HRV), which measures the fluctuation of heartbeat intervals, has been considered an important indicator for general health evaluation. In this paper, we present mmHRV, a contact-free HRV monitoring system using commercial millimeter-wave (mmWave) radio. We devise a heartbeat signal extractor that optimizes the decomposition of the phase of the channel information modulated by chest movement, and thus estimates the heartbeat signal. The exact times of heartbeats are estimated by locating the peaks of the heartbeat signal, and the Inter-Beat Intervals (IBIs) are further derived for evaluating the HRV metrics. Experimental results over 10 participants show that mmHRV measures HRV accurately, with an average mean-IBI error of 3.68 ms (corresponding to 99.49% accuracy).
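Given the estimated heartbeat times, the standard time-domain HRV metrics follow directly from the inter-beat intervals. This sketch covers only that last step, not mmHRV's phase decomposition or peak detection.

```python
import numpy as np

def hrv_metrics(beat_times_s):
    """Standard time-domain HRV metrics from estimated heartbeat times (seconds)."""
    ibi = np.diff(beat_times_s) * 1000.0               # inter-beat intervals in ms
    return {
        "mean_ibi": ibi.mean(),
        "sdnn": ibi.std(ddof=1),                       # overall variability
        "rmssd": np.sqrt(np.mean(np.diff(ibi) ** 2)),  # beat-to-beat variability
        "pnn50": np.mean(np.abs(np.diff(ibi)) > 50.0) * 100.0,  # % successive diffs > 50 ms
    }
```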
Citations: 1
Arrhythmia Classification with Heartbeat-Aware Transformer
Bin Wang, Chang Liu, Chuanyan Hu, Xudong Liu, Jun Cao
DOI: https://doi.org/10.1109/ICASSP39728.2021.9413938 | Published: 2021-06-06
Abstract: Electrocardiography (ECG) is a conventional method in arrhythmia diagnosis. In this paper, we propose a novel neural network model that treats the typical heartbeat classification task as a 'translation' problem. We introduce a Transformer structure into the model and add a heartbeat-aware attention mechanism to enhance the alignment between the encoded and decoded sequences. After training on an ECG database collected from 200k patients in over 2,000 hospitals for more than 10 years, validation on an independent test dataset shows that this heartbeat-aware Transformer model outperforms the classic Transformer and other sequence-to-sequence methods. Finally, we show that visualizing the encoder-decoder attention weights provides interpretable information about how the Transformer makes a diagnosis from raw ECG signals, which has guiding significance for clinical diagnosis.
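The 'translation' framing can be sketched with a vanilla encoder-decoder Transformer over raw ECG; the paper's heartbeat-aware attention bias is the novelty and is not reproduced here, and all sizes (and the convolutional patching) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ECGSeq2Seq(nn.Module):
    """Vanilla Transformer framing of 'ECG signal -> per-beat label sequence'.
    Positional encodings are omitted for brevity."""
    def __init__(self, n_labels=5, d_model=128):
        super().__init__()
        self.embed_sig = nn.Conv1d(1, d_model, kernel_size=16, stride=8)  # patchify raw ECG
        self.embed_lab = nn.Embedding(n_labels, d_model)
        self.transformer = nn.Transformer(d_model, nhead=4,
                                          num_encoder_layers=4, num_decoder_layers=4,
                                          batch_first=True)
        self.out = nn.Linear(d_model, n_labels)

    def forward(self, ecg, label_in):
        # ecg: (B, 1, L) raw signal; label_in: (B, T) shifted target labels
        src = self.embed_sig(ecg).transpose(1, 2)        # (B, L', d_model)
        tgt = self.embed_lab(label_in)                   # (B, T, d_model)
        mask = self.transformer.generate_square_subsequent_mask(label_in.size(1))
        return self.out(self.transformer(src, tgt, tgt_mask=mask))  # (B, T, n_labels)
```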
Citations: 6