ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Publications

Adaptive Reduced-Dimensional Beamspace Beamformer Design by Analogue Beam Selection
Xiangrong Wang, E. Aboutanios
DOI: 10.1109/ICASSP.2019.8683360 · pp. 4350-4354 · Published 2019-05-12
Abstract: Adaptive beamforming of large antenna arrays is difficult to implement due to prohibitively high hardware cost and computational complexity. An antenna selection strategy has been used to maximize the output signal-to-interference-plus-noise ratio (SINR) with fewer antennas by optimizing the array configuration. However, the antenna selection scheme suffers substantial performance degradation compared to the full array system. In this paper, we consider a reduced-dimensional beamspace beamformer, in which analogue phase shifters adaptively synthesize a subset of orthogonal beams whose outputs are then processed by a beamspace beamformer. We examine the selection problem of adaptively identifying the beams most relevant to achieving nearly the full beamspace performance, especially in the generalized case without any prior information. Simulation results demonstrate that beam selection retains the complexity advantages of selection while enhancing the output SINR over antenna selection.
Citations: 3
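The beamspace idea above can be illustrated with a minimal numpy sketch: project the element-space problem onto a few selected orthogonal DFT beams and solve a small MVDR problem there. The array size, directions, and powers below are illustrative assumptions, not values from the paper.

```python
import numpy as np

N, K = 16, 4                             # array elements, selected beams

def steering(theta, n=N):
    # Uniform linear array steering vector, half-wavelength spacing.
    return np.exp(1j * np.pi * np.arange(n) * np.sin(theta))

F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # orthogonal DFT beams (columns)

a_s = steering(np.deg2rad(10))           # desired source direction
gains = np.abs(F.conj().T @ a_s)
sel = np.argsort(gains)[-K:]             # keep the K strongest beams
B = F[:, sel]                            # N x K analogue beamspace transform

a_i = steering(np.deg2rad(40))           # one interferer
R = (np.outer(a_s, a_s.conj())
     + 10 * np.outer(a_i, a_i.conj())
     + 0.1 * np.eye(N))                  # element-space covariance

R_b = B.conj().T @ R @ B                 # reduced K x K covariance
a_b = B.conj().T @ a_s
w_b = np.linalg.solve(R_b, a_b)
w_b /= np.vdot(a_b, w_b)                 # distortionless: w^H a_b = 1
```

The adaptive weights now live in a K-dimensional space, so the matrix inversion cost drops from N^3 to K^3 while the selected beams retain most of the desired-signal energy.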
Reflection Symmetry Detection by Embedding Symmetry in a Graph
R. Nagar, S. Raman
DOI: 10.1109/ICASSP.2019.8682412 · pp. 2147-2151 · Published 2019-05-12
Abstract: Reflection symmetry is ubiquitous in nature and plays an important role in object detection and recognition tasks. Most existing methods for symmetry detection extract keypoints and describe each with a descriptor and a mirrored descriptor; two keypoints are deemed mirror-symmetric if the original descriptor of one is similar to the mirrored descriptor of the other. However, these methods suffer from an issue: the background pixels around mirror-symmetric pixels lying on the boundary of an object can differ, so their descriptors can differ, even though the boundary of a symmetric object is a major component of global reflection symmetry. We exploit the estimated object boundary and describe each boundary pixel using only the estimated normal of the boundary segment around it. We embed candidate symmetry axes in a graph as cliques to robustly detect the symmetry axes, and show that this approach achieves state-of-the-art results on a standard dataset.
Citations: 4
Asymptotically Optimal Recovery of Gaussian Sources from Noisy Stationary Mixtures: the Least-noisy Maximally-separating Solution
A. Weiss, A. Yeredor
DOI: 10.1109/ICASSP.2019.8682761 · pp. 5466-5470 · Published 2019-05-12
Abstract: We address the problem of source separation from noisy mixtures in a semi-blind scenario, with stationary, temporally-diverse Gaussian sources of known spectra. In such noisy models, a dilemma arises regarding the desired objective. On one hand, a "maximally separating" solution, providing the minimal attainable Interference-to-Source Ratio (ISR), would often suffer from significant residual noise. On the other hand, optimal Minimum Mean Square Error (MMSE) estimation would yield the "least distorted" versions of the true sources, often at the cost of a compromised ISR. Based on Maximum Likelihood (ML) estimation of the unknown underlying model parameters, we propose two ML-based estimates of the sources: one asymptotically coincides with the MMSE estimate of the sources, whereas the other asymptotically coincides with the (unbiased) "least-noisy maximally-separating" solution for this model. We prove the asymptotic optimality of the latter and present the corresponding Cramér-Rao lower bound. We discuss the differences in principal properties of the proposed estimates and demonstrate them empirically using simulation results.
Citations: 1
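The distortion-versus-separation dilemma in the abstract can be demonstrated with a toy linear-Gaussian model. Below, the mixing matrix is assumed known (unlike the paper's semi-blind ML setting): the pseudo-inverse plays the "maximally separating" role (zero interference, amplified noise), while the linear-MMSE (Wiener) estimator trades residual interference for lower overall error.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))          # known 3 x 2 mixing matrix
Rs = np.diag([2.0, 0.5])                 # source covariance (known spectra)
Rn = 0.1 * np.eye(3)                     # noise covariance

s = rng.multivariate_normal(np.zeros(2), Rs, size=2000).T
n = rng.multivariate_normal(np.zeros(3), Rn, size=2000).T
y = A @ s + n                            # noisy stationary mixture

W = Rs @ A.T @ np.linalg.inv(A @ Rs @ A.T + Rn)   # linear-MMSE matrix
s_mmse = W @ y                                     # least-distorted estimate
s_sep = np.linalg.pinv(A) @ y                      # maximally separating

mse_mmse = np.mean((s_mmse - s) ** 2)
mse_sep = np.mean((s_sep - s) ** 2)
```

Empirically `mse_mmse` comes out below `mse_sep`: the separating solution removes all cross-source interference but keeps the amplified noise, which is exactly the trade-off the paper's two ML-based estimates target from opposite ends.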
Evolutionary Subspace Clustering: Discovering Structure in Self-expressive Time-series Data
Abolfazl Hashemi, H. Vikalo
DOI: 10.1109/ICASSP.2019.8682405 · pp. 3707-3711 · Published 2019-05-12
Abstract: An evolutionary self-expressive model is proposed for clustering a collection of evolving data points that lie on a union of low-dimensional evolving subspaces. A parsimonious representation of the data points at each time step is learned via a non-convex optimization framework that exploits the self-expressiveness of the evolving data while taking into account the data representation from the preceding time step. The resulting scheme adaptively learns an innovation matrix that captures changes in the self-representation of the data across consecutive time steps, as well as a smoothing parameter reflecting the rate of data evolution. Extensive experiments demonstrate the superiority of the proposed framework over state-of-the-art static subspace clustering algorithms and existing evolutionary clustering schemes.
Citations: 2
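One time step of the evolutionary self-expressive idea can be sketched with a ridge-regularised update: represent each point at time t as a combination of the other points while staying close to the previous representation. This convex simplification stands in for the paper's non-convex scheme, and `lam` is an assumed smoothing weight, not a learned one.

```python
import numpy as np

rng = np.random.default_rng(0)
X_t = rng.standard_normal((10, 30))      # features x points at time t
C_prev = np.zeros((30, 30))              # representation from time t-1
lam = 1.0                                # assumed smoothing weight

# Minimise ||X_t - X_t C||_F^2 + lam ||C - C_prev||_F^2 in closed form:
G = X_t.T @ X_t
C_t = np.linalg.solve(G + lam * np.eye(30), G + lam * C_prev)
```

Each column of `C_t` expresses one point through the others, pulled toward the previous step's coefficients; a spectral-clustering step on the affinity `|C_t| + |C_t|.T` would then yield the evolving cluster labels.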
Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition
Shoukang Hu, Max W. Y. Lam, Xurong Xie, Shansong Liu, Jianwei Yu, Xixin Wu, Xunying Liu, H. Meng
DOI: 10.1109/ICASSP.2019.8682487 · pp. 6555-6559 · Published 2019-05-12
Abstract: The hidden activation functions inside deep neural networks (DNNs) play a vital role in learning high-level discriminative features and controlling the information flow to track longer history. However, the fixed model parameters used in standard DNNs can lead to over-fitting and poor generalization when training data is limited. Furthermore, the precise forms of the activations used in DNNs are often set manually at a global level for all hidden nodes, and thus lack an automatic selection method. To address these issues, Bayesian neural network (BNN) acoustic models are proposed in this paper to explicitly model the uncertainty associated with DNN parameters. Gaussian Process (GP) activation based DNN and LSTM acoustic models are also used to allow the optimal forms of hidden activations to be stochastically learned for individual hidden nodes. An efficient variational inference based training algorithm is derived for the BNN, GPNN and GPLSTM systems. Experiments were conducted on an LVCSR system trained on a 75-hour subset of Switchboard I data. The best BNN and GPNN systems outperformed both the baseline DNN systems constructed using fixed-form activations and their combination via frame-level joint decoding by 1% absolute in word error rate.
Citations: 9
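The core BNN mechanic, treating each weight as a distribution and sampling via the reparameterisation trick, can be shown in a few lines. This is a sketch of the idea of modelling parameter uncertainty, not the paper's variational training of full acoustic models; the layer sizes and `rho` initialisation are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.zeros((4, 3))                  # variational mean of the weights
rho = -3.0 * np.ones((4, 3))           # std parameterised as softplus(rho)
sigma = np.log1p(np.exp(rho))          # always positive

W = mu + sigma * rng.standard_normal(mu.shape)  # one Monte Carlo sample
x = rng.standard_normal(3)
h = np.tanh(W @ x)                     # hidden activation under sampled weights
```

Training would average the likelihood over such samples and add a KL penalty between the variational distribution and the prior; at test time, averaging over several samples gives uncertainty-aware predictions.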
Neural Variational Identification and Filtering for Stochastic Non-linear Dynamical Systems with Application to Non-intrusive Load Monitoring
Henning Lange, M. Berges, J. Z. Kolter
DOI: 10.1109/ICASSP.2019.8683552 · pp. 8340-8344 · Published 2019-05-12
Abstract: In this paper, an algorithm is introduced for performing system identification and inference of the filtering recursion for stochastic non-linear dynamical systems. The algorithm additionally allows domain constraints on the state variable to be enforced. It makes use of an approximate inference technique called variational inference, with deep neural networks as the optimization engine. Although general in nature, the algorithm is evaluated in the context of non-intrusive load monitoring: the problem of inferring the operational state of individual electrical appliances from aggregate measurements of electrical power collected in a home.
Citations: 6
Surgical Activities Recognition Using Multi-scale Recurrent Networks
Ilker Gurcan, H. Nguyen
DOI: 10.1109/ICASSP.2019.8683849 · pp. 2887-2891 · Published 2019-05-12
Abstract: Surgical activity recognition has recently been receiving significant attention from the medical imaging community. Existing state-of-the-art approaches employ recurrent neural networks such as long short-term memory (LSTM) networks. However, our experiments show that these networks are not effective at capturing relationships among features at different temporal scales, a limitation that leads to sub-optimal recognition of surgical activities containing complex motions at multiple time scales. To overcome this shortcoming, we propose a multi-scale recurrent neural network (MS-RNN) that combines the strengths of wavelet scattering operations and LSTM. We validate the effectiveness of the proposed network using both real and synthetic datasets. Our experimental results show that MS-RNN outperforms state-of-the-art methods in surgical activity recognition by a significant margin. On a synthetic dataset, the proposed network achieves more than 90% classification accuracy while LSTM accuracy is around chance level. Experiments on a real surgical activity dataset show a significant improvement in recognition accuracy over the current state of the art (90.2% versus 83.3%).
Citations: 8
Multi-step Self-attention Network for Cross-modal Retrieval Based on a Limited Text Space
Zheng Yu, Wenmin Wang, Ge Li
DOI: 10.1109/ICASSP.2019.8682424 · pp. 2082-2086 · Published 2019-05-12
Abstract: Cross-modal retrieval has recently been proposed to find an appropriate subspace in which the similarity among different modalities, such as image and text, can be measured directly. In this paper, we propose the Multi-step Self-Attention Network (MSAN), which performs cross-modal retrieval in a limited text space using multiple attention steps: it selectively attends to partial shared information at each step and aggregates the useful information over all steps to measure the final similarity. To achieve better retrieval results with faster training, we introduce global prior knowledge as global reference information. Extensive experiments on Flickr30K and MSCOCO show that MSAN achieves new state-of-the-art accuracy for cross-modal retrieval.
Citations: 0
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition
Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, H. Meng
DOI: 10.1109/ICASSP.2019.8682154 · pp. 6675-6679 · Published 2019-05-12
Abstract: Speech emotion recognition (SER) plays an important role in intelligent speech interaction. One vital challenge in SER is extracting emotion-relevant features from speech signals. In state-of-the-art SER techniques, deep learning methods, e.g., convolutional neural networks (CNNs), are widely employed for feature learning and have achieved significant performance. However, CNN-oriented methods face two limitations: 1) the loss of the temporal structure of speech under progressive resolution reduction; and 2) the neglect of relative dependencies between elements in the suprasegmental feature sequence. In this paper, we propose combining a Dilated Residual Network (DRN) with multi-head self-attention to alleviate these limitations. The DRN retains high temporal resolution during feature learning, with a receptive field similar to that of CNN-based approaches. Multi-head self-attention models the dependencies between elements at different positions in the learned suprasegmental feature sequence, enhancing the extraction of emotion-salient information. Experiments on the emotional benchmark dataset IEMOCAP demonstrate the effectiveness of the proposed framework, with 11.7% to 18.6% relative improvement over state-of-the-art approaches.
Citations: 45
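The multi-head self-attention operation referenced above reduces to a softmax-weighted mixing of sequence positions per head. The sketch below uses identity per-head projections for brevity (a trained network would use learned Q/K/V projections, and the sequence here is random, not speech features).

```python
import numpy as np

def multi_head_self_attention(X, num_heads=2):
    # X: (T, d) feature sequence; d must be divisible by num_heads.
    T, d = X.shape
    dh = d // num_heads
    heads = []
    for h in range(num_heads):
        Q = K = V = X[:, h * dh:(h + 1) * dh]       # identity projections
        scores = Q @ K.T / np.sqrt(dh)              # scaled dot products
        scores -= scores.max(axis=1, keepdims=True) # numerically stable
        A = np.exp(scores)
        A /= A.sum(axis=1, keepdims=True)           # softmax rows sum to 1
        heads.append(A @ V)                         # mix positions per head
    return np.concatenate(heads, axis=1)            # back to (T, d)

X = np.random.default_rng(0).standard_normal((5, 8))  # 5 frames, 8 dims
Y = multi_head_self_attention(X, num_heads=2)
```

Because every output position is a weighted sum over all input positions, dependencies between distant elements of the suprasegmental sequence are modelled in a single layer, regardless of their separation in time.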
Baseline Wander Removal and Isoelectric Correction in Electrocardiograms Using Clustering
Kjell Le, T. Eftestøl, K. Engan, Ø. Kleiven, S. Ørn
DOI: 10.1109/ICASSP.2019.8683084 · pp. 1274-1278 · Published 2019-05-12
Abstract: Baseline wander is low-frequency noise that is often removed from electrocardiogram signals by a highpass filter. However, this may not be sufficient to correct the isoelectric level of the signal: an isoelectric bias can remain. The isoelectric level is used as a reference point for amplitude measurements, and it is recommended that this point be at 0 V, i.e. isoelectric-adjusted. To correct the isoelectric level, a clustering method is proposed to determine the isoelectric bias, which is then subtracted from a signal-averaged template. Calculation of the mean electrical axis (MEA) is used to evaluate the isoelectric correction. The MEA can be estimated from any lead pair in the frontal plane, and a low variance in the estimates over the different lead pairs would suggest that the MEA calculations are consistent. Different methods for calculating the MEA are evaluated, and the variance in the results, as well as other measures, favours the proposed isoelectric-adjusted signals for all MEA methods.
Citations: 1
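The two-stage idea, highpass filtering for wander followed by a data-driven bias estimate, can be sketched on a toy signal. Here the clustering step is simplified to taking the densest amplitude bin (where flat isoelectric segments concentrate) as the bias; the synthetic "ECG", cutoff frequency, and bin count are illustrative assumptions, not the paper's choices.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 500.0
t = np.arange(0, 10, 1 / fs)
ecg = np.zeros_like(t)
ecg[(np.arange(t.size) % 500) < 15] = 1.0            # crude QRS-like spikes
x = ecg + 0.5 * np.sin(2 * np.pi * 0.2 * t) + 0.3    # wander + constant bias

# Stage 1: highpass filter removes the slow baseline wander.
b, a = butter(2, 0.5 / (fs / 2), btype="highpass")
y = filtfilt(b, a, x)

# Stage 2: estimate the residual isoelectric bias from the densest
# amplitude bin and subtract it (clustering stand-in).
hist, edges = np.histogram(y, bins=100)
k = np.argmax(hist)
bias = 0.5 * (edges[k] + edges[k + 1])
y_corr = y - bias                                    # isoelectric-adjusted
```

The point of the second stage is visible here: the filter forces the signal to zero mean, but the flat baseline between beats then sits slightly off zero, and only the bias subtraction places the isoelectric segments at 0.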