Latest papers from ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Adversarial Generative Distance-Based Classifier for Robust Out-of-Domain Detection
Zhiyuan Zeng, Hong Xu, Keqing He, Yuanmeng Yan, Sihong Liu, Zijun Liu, Weiran Xu
{"title":"Adversarial Generative Distance-Based Classifier for Robust Out-of-Domain Detection","authors":"Zhiyuan Zeng, Hong Xu, Keqing He, Yuanmeng Yan, Sihong Liu, Zijun Liu, Weiran Xu","doi":"10.1109/ICASSP39728.2021.9413908","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413908","url":null,"abstract":"Detecting out-of-domain (OOD) intents is critical in a task-oriented dialog system. Existing methods rely heavily on extensive manually labeled OOD samples and lack robustness. In this paper, we propose an efficient adversarial attack mechanism to augment hard OOD samples and design a novel generative distance-based classifier to detect OOD samples instead of a traditional threshold-based discriminator classifier. Experiments on two public benchmark datasets show that our method can consistently outperform the baselines with a statistically significant margin.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115409791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers
Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du
{"title":"Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers","authors":"Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du","doi":"10.1109/ICASSP39728.2021.9414018","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414018","url":null,"abstract":"This paper focuses on the problem of temporal dependency modeling in the CNN-based models for audio classification tasks. To capture audio temporal dependencies using CNNs, we take a different approach from the purely architecture-induced method and explicitly encode temporal dependencies into the CNN-based audio classifiers. More specifically, in addition to the classification objective, we require the CNN model to solve an auxiliary task of predicting the future features, which is formulated by leveraging the Contrastive Predictive Coding (CPC) loss. Furthermore, a novel hierarchical CPC (HCPC) model is proposed for capturing multi-level temporal dependencies at the same time. The proposed model is evaluated on a wide range of non-speech audio signals, including musical and in-the-wild environmental audio signals. We show that the proposed approach improves the backbone CNNs consistently on all tested benchmark datasets and outperforms a DenseNet model trained from scratch.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115426716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Technique for OFDM Symbol Slicing
A. Pérez-Neira, M. Lagunas
{"title":"A Technique for OFDM Symbol Slicing","authors":"A. Pérez-Neira, M. Lagunas","doi":"10.1109/ICASSP39728.2021.9414504","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414504","url":null,"abstract":"This work presents an orthonormal transform that splits the Orthogonal Frequency Division Multiplex (OFDM) symbol into slices with ranked rate and decoding complexity. The advantage over the existing carrier or time segmentation is that the proposed technique does not depend on the frequency channel to produce slices of equal rate. Also, the encoding and the decoding complexity is kept simple.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115711064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Layered Embedding-Based Scheme to Cope with Intra-Frame Distortion Drift In IPM-Based HEVC Steganography
Xiaoqing Jia, Jie Wang, Yongliang Liu, Xiangui Kang, Yun-Qing Shi
{"title":"A Layered Embedding-Based Scheme to Cope with Intra-Frame Distortion Drift In IPM-Based HEVC Steganography","authors":"Xiaoqing Jia, Jie Wang, Yongliang Liu, Xiangui Kang, Yun-Qing Shi","doi":"10.1109/ICASSP39728.2021.9413728","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413728","url":null,"abstract":"The spatial correlation of the intra-frame prediction units brings great challenges when minimizing embedding distortions using syndrome-trellis coding (STC) in High Efficiency Video Coding (HEVC) steganography. To solve this problem, we propose a layered embedding scheme which embeds information into the intra-prediction modes (IPMs) of 4×4 intra-frame prediction units (PUs) in HEVC. Firstly we divide the PUs of the intra-frame into different layers using Hasse diagram and make modification decisions for PUs in each layer respectively to decorrelate the correlated PUs. Secondly we make a statistics on more than 100,000 sampling PU pairs to quantitatively analyze the impacts between the distortions of PUs and then design a distortion function which takes mutual impacts of PUs into account. Experimental results show that our method can significantly reduce the embedding distortion and improve the security compared with the existing STC-based steganography methods embedding in IPMs.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114655037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Transitive Transfer Sparse Coding for Distant Domain
Lingtian Feng, Feng Qian, Xin He, Yuqi Fan, H. Cai, Guangmin Hu
{"title":"Transitive Transfer Sparse Coding for Distant Domain","authors":"Lingtian Feng, Feng Qian, Xin He, Yuqi Fan, H. Cai, Guangmin Hu","doi":"10.1109/ICASSP39728.2021.9415021","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9415021","url":null,"abstract":"The transfer learning between the source and target domain has already achieved significant success in machine learning areas. However, the existing methods can not achieve satisfactory result when solving the two distant domains transfer learning problem. In the worst case, it could lead to the negative transfer. In this paper, we propose a novel framework called transitive transfer sparse coding (TTSC) to solve the two distant domains transfer learning problem. On the one hand, as an extension of the sparse coding, the TTSC framework constructs a robust and high-level dictionary across three different domains and simultaneously obtains three good feature sparse representations. On the other hand, TTSC utilizes the intermediate domain as a strong bridge to transfer valuable knowledge between the source domain and target domain. Empirical studies validated that the TTSC framework significantly could outperform state-of-the-art methods.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114645949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
An Empirical Study on Task-Oriented Dialogue Translation
Siyou Liu
{"title":"An Empirical Study on Task-Oriented Dialogue Translation","authors":"Siyou Liu","doi":"10.1109/ICASSP39728.2021.9413521","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413521","url":null,"abstract":"Translating conversational text, in particular task-oriented dialogues, is an important application task for machine translation technology. However, it has so far not been extensively explored due to its inherent characteristics including data limitation, discourse, informality and personality. In this paper, we systematically investigate advanced models on the task-oriented dialogue translation task, including sentence-level, document-level and non-autoregressive NMT models. Besides, we explore existing techniques such as data selection, back/forward translation, larger batch learning, finetuning and domain adaptation. To alleviate low-resource problem, we transfer general knowledge from four different pre-training models to the downstream task. Encouragingly, we find that the best model with mBART pre-training pushes the SOTA performance on WMT20 English-German and IWSLT DIALOG Chinese-English datasets up to 62.67 and 23.21 BLEU points, respectively.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121792186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Interpolation of Irregularly Sampled Frequency Response Functions Using Convolutional Neural Networks
M. Acerbi, R. Malvermi, Mirco Pezzoli, F. Antonacci, A. Sarti, R. Corradi
{"title":"Interpolation of Irregularly Sampled Frequency Response Functions Using Convolutional Neural Networks","authors":"M. Acerbi, R. Malvermi, Mirco Pezzoli, F. Antonacci, A. Sarti, R. Corradi","doi":"10.1109/ICASSP39728.2021.9413458","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9413458","url":null,"abstract":"In the field of structural mechanics, classical methods for the vibrational characterization of objects exploit the inherent redundancy of a relevant amount of measurements acquired over regular sampling grids. However, there are cases in which parts of the objects under analysis are not accessible with sensors, leading to irregular sampling grids characterized by holes. Recent works have proved the benefits of adding prior knowledge in these scenarios, either through the definition of a suitable decomposition or using Finite Element modelling. In this paper we propose to use Convolutional Autoencoders (CA) for Frequency Response Function (FRF) interpolation from grids with different subsampling schemes. CA learn a compressed representation from a dataset of FRFs synthetized through Finite Element Analysis. Experiments with numerical and experimental data show the effectiveness of the model with a different amount of missing data and its ability to predict real FRFs characterized by different damping and sampling frequency.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116633706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Incorporate Maximum Mean Discrepancy in Recurrent Latent Space for Sequential Generative Model
Yuchi Zhang, Yongliang Wang, Yang Dong
{"title":"Incorporate Maximum Mean Discrepancy in Recurrent Latent Space for Sequential Generative Model","authors":"Yuchi Zhang, Yongliang Wang, Yang Dong","doi":"10.1109/ICASSP39728.2021.9414580","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414580","url":null,"abstract":"Stochastic recurrent neural networks have shown promising performance for modeling complex sequences. Nonetheless, existing methods adopt KL divergence as distribution regularizations in their latent spaces, which limits the choices of models for latent distribution construction. In this paper, we incorporate maximum mean discrepancy in the recurrent structure for distribution regularization. Maximum mean discrepancy is able to measure the difference between two distributions by just sampling from them, which enables us to construct more complicated latent distributions by neural networks. Therefore, our proposed algorithm is able to model more complex sequences. Experiments conducted on two different sequential modeling tasks show that our method outperforms the state-of-the-art sequential modeling algorithms.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116928570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mixture of Informed Experts for Multilingual Speech Recognition
Neeraj Gaur, B. Farris, Parisa Haghani, Isabel Leal, Pedro J. Moreno, Manasa Prasad, B. Ramabhadran, Yun Zhu
{"title":"Mixture of Informed Experts for Multilingual Speech Recognition","authors":"Neeraj Gaur, B. Farris, Parisa Haghani, Isabel Leal, Pedro J. Moreno, Manasa Prasad, B. Ramabhadran, Yun Zhu","doi":"10.1109/ICASSP39728.2021.9414379","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9414379","url":null,"abstract":"When trained on related or low-resource languages, multilingual speech recognition models often outperform their monolingual counterparts. However, these models can suffer from loss in performance for high resource or unrelated languages. We investigate the use of a mixture-of-experts approach to assign per-language parameters in the model to increase network capacity in a structured fashion. We introduce a novel variant of this approach, ‘informed experts’, which attempts to tackle inter-task conflicts by eliminating gradients from other tasks in these task-specific parameters. We conduct experiments on a real-world task with English, French and four dialects of Arabic to show the effectiveness of our approach. Our model matches or outperforms the monolingual models for almost all languages, with gains of as much as 31% relative. Our model also outperforms the baseline multilingual model for all languages by up to 9% relative.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120936889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
Extended Object Tracking With Automotive Radar Using B-Spline Chained Ellipses Model
G. Yao, P. Wang, K. Berntorp, Hassan Mansour, P. Boufounos
{"title":"Extended Object Tracking With Automotive Radar Using B-Spline Chained Ellipses Model","authors":"G. Yao, P. Wang, K. Berntorp, Hassan Mansour, P. Boufounos","doi":"10.1109/ICASSP39728.2021.9415080","DOIUrl":"https://doi.org/10.1109/ICASSP39728.2021.9415080","url":null,"abstract":"This paper introduces a B-spline chained ellipses model representation for extended object tracking (EOT) using high-resolution automotive radar measurements. With offline automotive radar training datasets, the proposed model parameters are learned using the expectation-maximization (EM) algorithm. Then the probabilistic multi-hypothesis tracking (PMHT) along with the unscented transform (UT) is proposed to deal with the nonlinear forward-warping coordinate transformation, the measurement-to-ellipsis association, and the state update step. Numerical validation is provided to verify the effectiveness of the proposed EOT framework with automotive radar measurements.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121107666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5