ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Articles

Detecting Adversarial Attacks In Time-Series Data
Mubarak G. Abdu-Aguye, W. Gomaa, Yasushi Makihara, Y. Yagi
DOI: 10.1109/ICASSP40776.2020.9053311 | Pages: 3092-3096
Abstract: In recent times, deep neural networks have seen increased adoption in highly critical tasks. They are also susceptible to adversarial attacks: specifically crafted changes to input samples that lead to erroneous output from such models. Such attacks have been shown to affect different types of data, such as images and, more recently, time-series data. This susceptibility could have catastrophic consequences, depending on the domain. We propose a method for detecting Fast Gradient Sign Method (FGSM) and Basic Iterative Method (BIM) adversarial attacks as adapted for time-series data. We frame the problem as an instance of outlier detection and construct a normalcy model based on information- and chaos-theoretic measures, which can then be used to determine whether unseen samples are normal or adversarial. Our approach shows promising performance on several datasets from the 2015 UCR Time Series Archive, reaching up to 97% detection accuracy in the best case.
Citations: 11
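The FGSM and BIM attacks this paper detects have standard closed forms (FGSM: x' = x + eps * sign of the input gradient of the loss; BIM: repeated small FGSM steps clipped to an eps-ball). A minimal illustrative sketch, not the paper's code, using a hypothetical logistic-regression surrogate classifier so the input gradient is available in closed form:

```python
import numpy as np

def fgsm_attack(x, w, y, eps):
    """FGSM on a 1-D time series x, against a logistic-regression
    surrogate (score = sigmoid(w @ x)). For binary cross-entropy,
    the gradient of the loss w.r.t. x is (sigmoid(w @ x) - y) * w."""
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    grad = (p - y) * w
    return x + eps * np.sign(grad)

def bim_attack(x, w, y, eps, alpha, n_iter):
    """BIM: iterate small FGSM steps of size alpha, clipping after
    each step so the perturbed series stays within eps of the original."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = fgsm_attack(x_adv, w, y, alpha)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

Against a real classifier, the same update is applied with the gradient obtained by backpropagation instead of the closed form above.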
Channel Charting: an Euclidean Distance Matrix Completion Perspective
Patrick Agostini, Z. Utkovski, S. Stańczak
DOI: 10.1109/ICASSP40776.2020.9053639 | Pages: 5010-5014
Abstract: Channel charting (CC) is an emerging machine learning framework that aims at learning lower-dimensional representations of the radio geometry from channel state information (CSI) collected in an area of interest, such that spatial relations of the representations in the different domains are preserved. Extracting features capable of correctly representing spatial properties between positions is crucial for learning reliable channel charts. Most approaches to CC in the literature rely on range distance estimates, which have the drawback that they only provide accurate distance information for collinear positions. Distances between positions with large azimuth separation are consistently underestimated by these approaches, and thus incorrectly mapped to close neighborhoods. In this paper, we introduce a correlation matrix distance (CMD) based dissimilarity measure for CC that allows us to group CSI measurements according to their collinearity. This provides us with the capability to discard points for which large distance errors are made, and to build a neighborhood graph between approximately collinear positions. The neighborhood graph allows us to state the problem of CC as an instance of a Euclidean distance matrix completion (EDMC) problem, where side information can be naturally introduced via convex box constraints.
Citations: 14
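The correlation matrix distance underlying the CMD dissimilarity is a standard measure (Herdin et al., 2005): d(R1, R2) = 1 - tr(R1 R2) / (||R1||_F ||R2||_F), which is 0 for identical correlation structure and 1 for orthogonal structure. A minimal sketch (not the paper's code):

```python
import numpy as np

def cmd(r1, r2):
    """Correlation matrix distance between two Hermitian correlation
    matrices: 0 when their structure matches up to scale, 1 when the
    structures are maximally different (orthogonal)."""
    num = np.trace(r1 @ r2)
    den = np.linalg.norm(r1, 'fro') * np.linalg.norm(r2, 'fro')
    return 1.0 - num / den
```

In the CC setting, a small CMD between two CSI correlation matrices indicates approximately collinear positions, for which range-based distance estimates are trustworthy.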
A Fast and Accurate Frequent Directions Algorithm for Low Rank Approximation via Block Krylov Iteration
Qianxin Yi, Chenhao Wang, Xiuwu Liao, Yao Wang
DOI: 10.1109/ICASSP40776.2020.9054022 | Pages: 3167-3171
Abstract: Frequent directions (FD) is a popular deterministic matrix sketching technique for low-rank approximation. However, FD and its randomized variants usually incur high computational cost or numerical instability on large-scale datasets, which limits their use in practice. To remedy these issues, this paper aims at improving the efficiency and effectiveness of FD. Specifically, by utilizing the power of Block Krylov Iteration and count sketch techniques, we propose a fast and accurate FD algorithm dubbed BKICS-FD. We derive the error bound of the proposed BKICS-FD and then carry out extensive numerical experiments to illustrate its superiority over several popular FD algorithms, both in terms of computational speed and accuracy.
Citations: 0
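For context, the deterministic FD baseline that BKICS-FD accelerates maintains an ell-row sketch B of a streamed matrix A and, whenever B fills up, shrinks its singular values so that A^T A - B^T B stays small (Liberty, 2013: spectral error at most 2 ||A||_F^2 / ell). A minimal sketch of that baseline, not of BKICS-FD itself, which adds Block Krylov Iteration and count sketch on top:

```python
import numpy as np

def frequent_directions(A, ell):
    """Plain Frequent Directions: stream the rows of A (n x d, d >= ell)
    into an ell x d sketch B with A^T A - B^T B positive semidefinite
    and small in spectral norm."""
    n, d = A.shape
    B = np.zeros((ell, d))
    for a in A:
        zero_rows = np.where(~B.any(axis=1))[0]
        if len(zero_rows) == 0:
            # sketch is full: shrink all squared singular values by the
            # smallest one, which zeroes out at least the last row
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s_shrunk = np.sqrt(np.maximum(s**2 - s[-1]**2, 0.0))
            B = s_shrunk[:, None] * Vt
            zero_rows = np.where(~B.any(axis=1))[0]
        B[zero_rows[0]] = a
    return B
```

The cost is dominated by the repeated SVDs of the small ell x d sketch, which is exactly what faster variants such as BKICS-FD target.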
Neural Lattice Search for Speech Recognition
Rao Ma, Hao Li, Qi Liu, Lu Chen, Kai Yu
DOI: 10.1109/ICASSP40776.2020.9054109 | Pages: 7794-7798
Abstract: To improve the accuracy of automatic speech recognition, a two-pass decoding strategy is widely adopted. The first-pass model generates compact word lattices, which are utilized by the second-pass model to perform rescoring. Currently, the most popular rescoring methods are N-best rescoring and lattice rescoring with long short-term memory language models (LSTMLMs). However, these methods suffer from either a limited search space or inconsistency between training and evaluation. In this paper, we address these problems with an end-to-end model for accurately extracting the best hypothesis from a word lattice. Our model is composed of a bidirectional LatticeLSTM encoder followed by an attentional LSTM decoder. The model takes a word lattice as input and generates the single best hypothesis from the given lattice space. When combined with an LSTMLM, the proposed model yields 9.7% and 7.5% relative WER reduction compared to N-best rescoring and lattice rescoring methods within the same amount of decoding time.
Citations: 4
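For contrast with the neural search, conventional best-hypothesis extraction from a word lattice is a Viterbi-style dynamic program over a DAG. A toy sketch with a hypothetical lattice encoding (nodes keyed in topological order; each edge is a (next_node, word, log_score) tuple):

```python
def best_path(lattice, start, end):
    """Highest-scoring path through a word lattice.
    lattice: dict mapping node -> list of (next_node, word, log_score),
    with keys inserted in topological order so a single forward sweep
    sees every node after all of its predecessors."""
    best = {start: (0.0, [])}  # node -> (best log score, word sequence)
    for node in lattice:
        if node not in best:
            continue  # unreachable from start
        score, words = best[node]
        for nxt, word, log_score in lattice[node]:
            cand = (score + log_score, words + [word])
            if nxt not in best or cand[0] > best[nxt][0]:
                best[nxt] = cand
    return best[end]
```

The paper's point is that such a search over raw lattice scores, or N-best rescoring on a pruned subset, is weaker than a model trained end-to-end to pick the best hypothesis from the full lattice.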
DNN-Based Speech Recognition for Globalphone Languages
Martha Yifiru Tachbelie, Ayimunishagu Abulimiti, S. Abate, Tanja Schultz
DOI: 10.1109/ICASSP40776.2020.9053144 | Pages: 8269-8273
Abstract: This paper describes new reference benchmark results based on hybrid Hidden Markov Model and Deep Neural Network (HMM-DNN) systems for the GlobalPhone (GP) multilingual text and speech database. GP is a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in more than 20 languages. Moreover, we provide new results for five additional languages, namely Amharic, Oromo, Tigrigna, Wolaytta, and Uyghur. Across the 22 languages considered, the hybrid HMM-DNN models outperform the HMM-GMM based models regardless of the amount of training speech used. Overall, we achieved relative improvements ranging from 7.14% to 59.43%.
Citations: 8
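The reported gains are relative word-error-rate reductions; for reference, the arithmetic behind such figures is simply:

```python
def relative_improvement(baseline_wer, new_wer):
    """Relative WER reduction of a new system over a baseline, in
    percent (the metric behind figures like 7.14% to 59.43%)."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

So a baseline at 20% WER improved to 10% WER is a 50% relative reduction, even though the absolute drop is 10 points.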
Multi-Scale Residual Network for Image Classification
X. Zhong, Oubo Gong, Wenxin Huang, Jingling Yuan, Bo Ma, R. W. Liu
DOI: 10.1109/ICASSP40776.2020.9053478 | Pages: 2023-2027
Abstract: The multi-scale approach, which represents image objects at various levels of detail, has been applied to many computer vision tasks. Existing image classification approaches place more emphasis on multi-scale convolution kernels and overlook multi-scale feature maps, so some of the shallower information in the network is not fully utilized. In this paper, we propose the Multi-Scale Residual (MSR) module, which integrates multi-scale feature maps carrying this underlying information into the last layer of a convolutional neural network. Our proposed method significantly enriches the information available to the final classification. Extensive experiments conducted on the CIFAR100, Tiny-ImageNet and large-scale Caltech-256 datasets demonstrate the effectiveness of our method compared with the Res-Family.
Citations: 4
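The core idea, fusing coarser-scale feature maps back into the final layer, can be illustrated in plain NumPy. This is a conceptual sketch only, not the paper's MSR architecture:

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling on an (H, W, C) feature map (H, W even)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def msr_fuse(feat):
    """Toy multi-scale fusion: pool the map to a coarser scale,
    upsample it back, and concatenate with the original along the
    channel axis, so the classifier sees both levels of detail."""
    coarse = upsample2(avg_pool2(feat))
    return np.concatenate([feat, coarse], axis=-1)
```

In the actual network, the coarser maps would come from earlier (shallower) layers rather than from re-pooling the last one, but the fusion-by-concatenation step is the same shape-wise.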
A Time-Frequency Network with Channel Attention and Non-Local Modules for Artificial Bandwidth Extension
Yuanjie Dong, Yaxing Li, Xiaoqi Li, Shanjie Xu, Dan Wang, Zhihui Zhang, Shengwu Xiong
DOI: 10.1109/ICASSP40776.2020.9053769 | Pages: 6954-6958
Abstract: Convolutional neural networks (CNNs) have recently attracted increasing attention for the artificial bandwidth extension (ABE) task. However, these methods use the flipped low-frequency phase to reconstruct speech signals, which can lead to the well-known invalid short-time Fourier transform (STFT) problem. Moreover, convolutional operations only enable networks to construct informative features by fusing channel-wise and spatial information within local receptive fields at each layer. In this paper, we introduce a Time-Frequency Network (TFNet) with channel attention (CA) and non-local (NL) modules for ABE. The TFNet exploits information from both time- and frequency-domain branches concurrently to avoid the invalid STFT problem. To capture channel and spatial dependencies, we incorporate the CA and NL modules into the proposed fully convolutional network used for the time and frequency branches of TFNet. Experimental results demonstrate that the proposed method outperforms the competing method.
Citations: 5
Motion Feedback Design for Video Frame Interpolation
Mengshun Hu, Liang Liao, Jing Xiao, Lin Gu, S. Satoh
DOI: 10.1109/ICASSP40776.2020.9053223 | Pages: 4347-4351
Abstract: This paper introduces a feedback-based approach to interpolating video frames involving small, fast-moving objects. Unlike existing feedforward methods that estimate optical flow and synthesize in-between frames sequentially, we introduce a motion-oriented component that adds a feedback block to the existing multi-scale autoencoder pipeline, feeding back information about small objects shared between the architectures at two different scales. We show that this additional information enables more robust detection of optical flow caused by small objects in fast motion. Through experiments on various datasets, we show that the feedback mechanism allows our method to achieve state-of-the-art results, both qualitatively and quantitatively.
Citations: 6
Dynamic Channel Pruning For Correlation Filter Based Object Tracking
Goutam Yelluru Gopal, Maria A. Amer
DOI: 10.1109/ICASSP40776.2020.9053333 | Pages: 5700-5704
Abstract: Fusion of multi-channel representations has played a crucial role in the success of correlation filter (CF) based trackers. However, not all channels contain useful information for target localization at every frame. In challenging scenarios, the ambiguous responses of non-discriminative or unreliable channels lead to erroneous results and cause tracker drift. To mitigate this problem, we propose a method for dynamic channel pruning through online (i.e., per-frame) learning of channel weights. Our method uses estimated reliability scores to compute channel weights, nullifying the impact of highly unreliable channels. The learning of channel weights is modeled as a non-smooth convex optimization problem, and we propose an algorithm that solves it efficiently compared to off-the-shelf solvers. Results on the VOT2018 and TC128 datasets show that the proposed method improves the performance of baseline CF trackers.
Citations: 1
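The paper learns its channel weights by solving a non-smooth convex program. As a rough stand-in that conveys the idea of reliability-based pruning, not the authors' formulation, one can threshold a standard CF reliability score such as the peak-to-sidelobe ratio (PSR) of each channel's response map:

```python
import numpy as np

def psr(response, exclude=2):
    """Peak-to-sidelobe ratio: how sharply a CF response map peaks.
    Sidelobe = all cells outside a small window around the peak."""
    i, j = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones_like(response, dtype=bool)
    mask[max(i - exclude, 0):i + exclude + 1,
         max(j - exclude, 0):j + exclude + 1] = False
    side = response[mask]
    return (response[i, j] - side.mean()) / (side.std() + 1e-12)

def channel_weights(responses, tau=2.0):
    """Thresholded reliability weighting: channels whose PSR falls
    below tau get weight 0 (pruned); the rest are normalized to sum 1."""
    w = np.array([max(psr(r) - tau, 0.0) for r in responses])
    s = w.sum()
    return w / s if s > 0 else w
```

A sharply peaked response map (confident localization) then dominates the fused response, while flat or ambiguous channels contribute nothing.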
Deriving Compact Feature Representations Via Annealed Contraction
Muhammad A Shah, B. Raj
DOI: 10.1109/ICASSP40776.2020.9054527 | Pages: 2068-2072
Abstract: It is common practice to use pretrained image recognition models to compute feature representations for visual data. The size of these representations can have a noticeable impact on the complexity of the models that use them, and by extension on their deployability and scalability. It would therefore be beneficial to have compact visual representations that carry as much information as their high-dimensional counterparts. To this end, we propose a technique that shrinks a layer through an iterative process in which neurons are removed from the layer and the network is fine-tuned. Using this technique, we are able to remove 99% of the neurons from the penultimate layer of AlexNet and VGG16 while suffering less than a 5% drop in accuracy on CIFAR10, Caltech101 and Caltech256. We also show that our method can reduce the size of AlexNet by 95% while suffering only a 4% reduction in accuracy on Caltech101.
Citations: 7
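The structural step of each contraction round, removing neurons from a dense layer along with the corresponding inputs of the next layer, can be sketched as below. Ranking neurons by incoming-weight norm is an illustrative assumption, not necessarily the paper's criterion, and the fine-tuning between rounds is omitted:

```python
import numpy as np

def prune_neurons(W, W_next, n_remove):
    """Drop the n_remove neurons of a dense layer whose incoming-weight
    rows (of W, shape out x in) have the smallest L2 norm, and remove
    the matching input columns from the next layer's weights W_next."""
    norms = np.linalg.norm(W, axis=1)
    keep = np.sort(np.argsort(norms)[n_remove:])  # indices to retain
    return W[keep], W_next[:, keep]
```

In the annealed scheme described by the abstract, this removal step would be applied a little at a time, with fine-tuning after each round, until the target layer width (e.g. 1% of the original) is reached.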