2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Factorization for analog-to-digital matrix multiplication
Edward H. Lee, Madeleine Udell, S. Wong
{"title":"Factorization for analog-to-digital matrix multiplication","authors":"Edward H. Lee, Madeleine Udell, S. Wong","doi":"10.1109/ICASSP.2015.7178132","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178132","url":null,"abstract":"We present matrix factorization as an enabling technique for analog-to-digital matrix multiplication (AD-MM). We show that factorization in the analog domain increases the total precision of AD-MM in precision-limited analog multiplication, reduces the number of analog-to-digital (A/D) conversions needed for overcomplete matrices, and avoids unneeded computations in the digital domain. Finally, we present a factorization algorithm using alternating convex relaxation.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126281421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A sequential dictionary learning algorithm with enforced sparsity
A. Seghouane, M. Hanif
{"title":"A sequential dictionary learning algorithm with enforced sparsity","authors":"A. Seghouane, M. Hanif","doi":"10.1109/ICASSP.2015.7178697","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178697","url":null,"abstract":"Dictionary learning algorithms have received widespread acceptance when it comes to data analysis and signal representation problems. These algorithms alternate between two stages: the sparse coding stage and the dictionary update stage. In all existing dictionary learning algorithms, the use of sparsity has been limited to the sparse coding stage, while the dictionary update stage differs between algorithms and can be carried out sequentially or in parallel. The singular value decomposition (SVD) has been successfully used for sequential dictionary update. In this paper we propose a dictionary learning algorithm that also includes a sparsity constraint in the dictionary update stage. The cost function used to include sparsity in the dictionary update stage is derived using the link between SVD and rank-one matrix approximation. The effectiveness of the proposed dictionary learning method is tested on synthetic data and an image processing application. The results reveal that including a sparsity constraint in the dictionary update stage is not a bad idea.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126198210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31
On the von Mises approximation for the distribution of the phase angle between two independent complex Gaussian vectors
N. Letzepis
{"title":"On the von mises approximation for the distribution of the phase angle between two independent complex Gaussian vectors","authors":"N. Letzepis","doi":"10.1109/ICASSP.2015.7178571","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178571","url":null,"abstract":"This paper analyses the von Mises approximation for the distribution of the phase angle between two independent complex Gaussian vectors. By upper bounding the Kullback-Leibler divergence, it is shown that when their circular means and variances coincide, the distribution converges to a von Mises distribution both in the low and high signal-to-noise ratio regimes.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126570981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation
Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann
{"title":"Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation","authors":"Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann","doi":"10.1109/ICASSP.2015.7177992","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7177992","url":null,"abstract":"For the enhancement of speech degraded by noise, accurate estimation of the noise power spectral density (PSD) is indispensable, especially if only a single microphone signal is available. Fast and accurate tracking of the noise PSD is particularly challenging in highly non-stationary noise types, since the distinction between speech and noise components becomes more difficult. Short-time discrete Fourier transform (STFT) based noise PSD estimation algorithms which employ estimates of the speech presence probability (SPP) with fixed priors have been shown to yield good tracking performance even in adverse noise conditions. In this paper, we compare two methods to incorporate spectro-temporal correlations to improve the tracking performance. The first method smoothes the noisy observation over time and frequency before computing the SPP, while the second is based on a Hidden Markov Model (HMM) of the speech presence and absence states. We show that the proposed modifications lead to improved noise PSD estimators which are less sensitive to spectral outliers of the noise and track changes in the noise PSD more quickly than the reference method. Further, when employed in a common speech enhancement setup, the proposed estimators achieve an increased noise reduction while keeping speech distortions at a comparable level.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122262047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Histogram-PMHT with an evolving Poisson prior
H. Vu, S. Davey, S. Arulampalam, F. Fletcher, C. Lim
{"title":"Histogram-PMHT with an evolving Poisson prior","authors":"H. Vu, S. Davey, S. Arulampalam, F. Fletcher, C. Lim","doi":"10.1109/ICASSP.2015.7178734","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178734","url":null,"abstract":"The Histogram-Probabilistic Multi-Hypothesis Tracker (H-PMHT) is an efficient multi-target tracking approach to the Track-Before-Detect (TkBD) problem. However, it cannot adequately deal with fluctuating targets and this can degrade track management performance. By assuming an alternative measurement model based on a Poisson distribution, the H-PMHT algorithm can be re-derived to incorporate a time-correlated estimate of the component mixing terms, allowing for an improved measure for track quality.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122280452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Location-aware object detection via coherent region grouping
Shen-Chi Chen, Kevin Lin, Chu-Song Chen, Y. Hung
{"title":"Location-aware object detection via coherent region grouping","authors":"Shen-Chi Chen, Kevin Lin, Chu-Song Chen, Y. Hung","doi":"10.1109/ICASSP.2015.7178178","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178178","url":null,"abstract":"We present a scene adaptation algorithm for object detection. Our method discovers scene-dependent features discriminative to classifying foreground objects into different categories. Unlike previous works suffering from insufficient training data collected online, our approach incorporated with a similarity grouping procedure can automatically gather more consistent training examples from a neighbour area. Experimental results show that the proposed method outperforms several related works with higher detection accuracies.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127944762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Tyler's estimator performance analysis
I. Soloveychik, A. Wiesel
{"title":"Tyler's estimator performance analysis","authors":"I. Soloveychik, A. Wiesel","doi":"10.1109/ICASSP.2015.7179061","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179061","url":null,"abstract":"This paper analyzes the performance of Tyler's M-estimator of the scatter matrix in elliptical populations. We focus on non-asymptotic performance analysis of Tyler's estimator. Given n samples of dimension p < n, we show that the squared Frobenius norm of the error of the inverse estimator is proportional to p^2/((1-c^2)^2 n) with high probability, where c is the coherence coefficient of the properly scaled estimator. Under additional group symmetry conditions we improve the obtained bound, utilizing the inherent sparsity properties of group symmetry.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128193635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
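For readers unfamiliar with the estimator analyzed in the entry above, the following is a minimal sketch of the standard fixed-point iteration usually associated with Tyler's M-estimator. It is not taken from the paper: the function name `tyler_estimator_2d`, the restriction to dimension p = 2, the trace normalization, the iteration count, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import random


def tyler_estimator_2d(samples, iterations=50):
    """Fixed-point iteration for Tyler's scatter estimate, restricted to p = 2.

    Hypothetical helper for illustration; the symmetric 2x2 estimate
    [[a, b], [b, d]] is stored as the three scalars a, b, d.
    """
    p = 2
    n = len(samples)
    a, b, d = 1.0, 0.0, 1.0  # start from the identity matrix
    for _ in range(iterations):
        det = a * d - b * b
        # Inverse of the current symmetric 2x2 estimate.
        ia, ib, idd = d / det, -b / det, a / det
        sa = sb = sd = 0.0
        for x, y in samples:
            # Each outer product is weighted by 1 / (x^T Sigma^{-1} x).
            q = ia * x * x + 2.0 * ib * x * y + idd * y * y
            sa += x * x / q
            sb += x * y / q
            sd += y * y / q
        a, b, d = p * sa / n, p * sb / n, p * sd / n
        # The estimate is only defined up to scale; normalize the trace to p.
        scale = p / (a + d)
        a, b, d = a * scale, b * scale, d * scale
    return [[a, b], [b, d]]


random.seed(0)
# Assumed synthetic data: independent Gaussian coordinates with unequal spread.
data = [(random.gauss(0, 2.0), random.gauss(0, 1.0)) for _ in range(500)]
sigma = tyler_estimator_2d(data)
print(sigma)
```

Because the estimate is scale-invariant, only the shape of the scatter matrix is recovered; the trace normalization used here is one common convention among several.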
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR
Andros Tjandra, S. Sakti, Graham Neubig, T. Toda, M. Adriani, Satoshi Nakamura
{"title":"Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR","authors":"Andros Tjandra, S. Sakti, Graham Neubig, T. Toda, M. Adriani, Satoshi Nakamura","doi":"10.1109/ICASSP.2015.7178827","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178827","url":null,"abstract":"This paper explores the use of auditory features based on cochleograms, two-dimensional speech features derived from gammatone filters, within the convolutional neural network (CNN) framework. Furthermore, we also propose various possibilities to combine cochleogram features with log-mel filter banks or spectrogram features. In particular, we combine them at low and high levels of the CNN framework, which we refer to as low-level and high-level feature combination. For comparison, we also construct a similar configuration with a deep neural network (DNN). Performance was evaluated in the framework of a hybrid neural network - hidden Markov model (NN-HMM) system on the TIMIT phoneme sequence recognition task. The results reveal that cochleogram-spectrogram feature combination provides significant advantages. The best accuracy was obtained by high-level combination of two-dimensional cochleogram-spectrogram features using CNN, achieving up to 8.2% relative phoneme error rate (PER) reduction from CNN single features or 19.7% relative PER reduction from DNN single features.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121720037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
An iterative deflation algorithm for exact CP tensor decomposition
A. P. D. Silva, P. Comon, A. D. Almeida
{"title":"An iterative deflation algorithm for exact CP tensor decomposition","authors":"A. P. D. Silva, P. Comon, A. D. Almeida","doi":"10.1109/ICASSP.2015.7178714","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178714","url":null,"abstract":"The Canonical Polyadic (CP) tensor decomposition has become an attractive mathematical tool these last ten years in various fields. Yet, efficient algorithms are still lacking to compute the full CP decomposition, whereas rank-one approximations are rather easy to compute. We propose a new deflation-based iterative algorithm allowing to compute the full CP decomposition, by resorting only to rank-one approximations. An analysis of convergence issues is included, as well as computer experiments. Our theoretical and experimental results show that the algorithm converges almost surely.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"288 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115893293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Joint training of front-end and back-end deep neural networks for robust speech recognition
Tian Gao, Jun Du, Lirong Dai, Chin-Hui Lee
{"title":"Joint training of front-end and back-end deep neural networks for robust speech recognition","authors":"Tian Gao, Jun Du, Lirong Dai, Chin-Hui Lee","doi":"10.1109/ICASSP.2015.7178797","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178797","url":null,"abstract":"Based on the recently proposed speech pre-processing front-end with deep neural networks (DNNs), we first investigate different feature mapping directly from noisy speech via DNN for robust speech recognition. Next, we propose to jointly train a single DNN for both feature mapping and acoustic modeling. In the end, we show that the word error rate (WER) of the jointly trained system could be significantly reduced by the fusion of multiple DNN pre-processing systems, which implies that features obtained from different domains of the DNN-enhanced speech signals are strongly complementary. Testing on the Aurora4 noisy speech recognition task, our best system with multi-condition training achieves an average WER of 10.3%, yielding a relative reduction of 16.3% over our previous DNN pre-processing only system with a WER of 12.3%. To the best of our knowledge, this represents the best published result on the Aurora4 task without using any adaptation techniques.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132030919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 73
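The relative WER reduction quoted in the abstract above can be checked from the two absolute figures it gives. The short sketch below does that arithmetic; the helper name `relative_reduction` is hypothetical and only the two WER values come from the abstract.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline - improved) / baseline


# Figures quoted in the abstract: 12.3% WER for the earlier DNN
# pre-processing only system, 10.3% WER for the best jointly trained system.
wer_reduction = relative_reduction(12.3, 10.3)
print(f"{100 * wer_reduction:.1f}% relative WER reduction")  # prints 16.3%
```

The same helper applies to the relative PER reductions reported elsewhere in this listing, although those abstracts do not give the absolute error rates needed to recompute them.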