{"title":"Regularized Adaboost for content identification","authors":"Honghai Yu, P. Moulin","doi":"10.1109/ICASSP.2013.6638224","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638224","url":null,"abstract":"This paper proposes a regularized Adaboost learning algorithm to extract binary fingerprints by filtering and quantizing perceptually significant features. The proposed algorithm extends the recent symmetric pairwise boosting (SPB) algorithm by taking feature sequence correlation into account. Information and learning theoretic analysis is given. Significant performance gains over SPB are demonstrated for both audio and video fingerprinting.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129337444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust low-complexity multichannel equalization for dereverberation","authors":"Felicia Lim, P. Naylor","doi":"10.1109/ICASSP.2013.6637736","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6637736","url":null,"abstract":"Multichannel equalization of acoustic impulse responses (AIRs) is an important approach for dereverberation. Since AIRs are inevitably estimated with system identification error (SIE), it is necessary to develop equalization designs that are robust to such SIE, in order for dereverberation processing to be beneficial. We present here a novel subband equalizer employing the relaxed multichannel least squares (RMCLS) algorithm in each subband. We show that this new structure brings improved performance in dereverberation as well as a reduction in computational load by up to a factor of more than 90 in our experiments. We then develop a novel controller for the dereverberation processing in subbands that guarantees robustness to even very severe SIEs by backing off dereverberation in any subband with excessively high levels of SIEs.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124991510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust joint sparse recovery on data with outliers","authors":"Ozgur Balkan, K. Kreutz-Delgado, S. Makeig","doi":"10.1109/ICASSP.2013.6638373","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638373","url":null,"abstract":"We propose a method to solve the multiple measurement vector (MMV) sparse signal recovery problem in a robust manner when data contains outlier points which do not fit the shared sparsity structure otherwise contained in the data. This scenario occurs frequently in the applications of MMV models due to only partially known source dynamics. The algorithm we propose is a modification of MMV-based sparse bayesian learning (M-SBL) by incorporating the idea of least trimmed squares (LTS), which has previously been developed for robust linear regression. Experiments show a significant performance improvement over the conventional M-SBL under different outlier ratios and amplitudes.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123015697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns","authors":"Hung-yi Lee, Yun-Chiao Li, Cheng-Tao Chung, Lin-Shan Lee","doi":"10.1109/ICASSP.2013.6639283","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639283","url":null,"abstract":"Query expansion techniques were originally developed for text information retrieval in order to retrieve the documents not containing the query terms but semantically related to the query. This is achieved by assuming the terms frequently occurring in the top-ranked documents in the first-pass retrieval results to be query-related and using them to expand the query to do the second-pass retrieval. However, when this approach was used for spoken content retrieval, the inevitable recognition errors and the OOV problems in ASR make it difficult for many query-related terms to be included in the expanded query, and much of the information carried by the speech signal is lost during recognition and not recoverable. In this paper, we propose to use a second ASR engine based on acoustic patterns automatically discovered from the spoken archive used for retrieval. These acoustic patterns are discovered directly based on the signal characteristics, and therefore can compensate for the information lost during recognition to a good extent. When a text query is entered, the system generates the first-pass retrieval results based on the transcriptions of the spoken segments obtained via the conventional ASR. The acoustic patterns frequently occurring in the spoken segments ranked on top of the first-pass results are considered as query-related, and the spoken segments containing these query-related acoustic patterns are retrieved. In this way, even though some query-related terms are OOV or incorrectly recognized, the segments including these terms can still be retrieved by acoustic patterns corresponding to these terms. Preliminary experiments performed on Mandarin broadcast news offered very encouraging results.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121165186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian robust adaptive beamforming based on random steering vector with bingham prior distribution","authors":"O. Besson, S. Bidon","doi":"10.1109/ICASSP.2013.6638367","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638367","url":null,"abstract":"We consider robust adaptive beamforming in the presence of steering vector uncertainties. A Bayesian approach is presented where the steering vector of interest is treated as a random vector with a Bingham prior distribution. Moreover, in order to also improve robustness against low sample support, the interference plus noise covariance matrix R is assigned a non informative prior distribution which enforces shrinkage to a scaled identity matrix, similarly to diagonal loading. The minimum mean square distance estimate of the steering vector as well as the minimum mean square error estimate of R are derived and implemented using a Gibbs sampling strategy. The new beamformer is shown to converge within a limited number of snapshots, despite the presence of steering vector errors.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134457717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A nonlinear dictionary for image reconstruction","authors":"Mathiruban Tharmalingam, K. Raahemifar","doi":"10.1109/ICASSP.2013.6638052","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638052","url":null,"abstract":"Complex signals such as images, audio and video recordings can be represented by a large over complete dictionary without distinguishable compromise on the representation quality. Large over complete dictionaries with more patterns can be used to increase the sparse coding as well as provide significant improvements in signal representation quality. The use of the over-complete dictionaries and sparse coding has been successfully applied in compression, de-noising, and pattern recognition applications within the last few decades. One particular dictionary, the Discrete Cosine Transform (DCT) dictionary has seen a great deal of success in image processing applications. However, we propose a novel non-linear over-complete dictionary that is sparser than the DCT dictionary while improving the quality of the signal representation. The proposed non-linear dictionary has demonstrated through experimental results to be superior to the DCT dictionary by achieving higher signal to noise ratio (SNR) in the reconstructed images.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133352518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition","authors":"Y. Bao, Hui Jiang, Lirong Dai, Cong Liu","doi":"10.1109/ICASSP.2013.6639015","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639015","url":null,"abstract":"Recently, the hybrid model combining deep neural network (DNN) with context-dependent HMMs has achieved some dramatic gains over the conventional GMM/HMM method in many speech recognition tasks. In this paper, we study how to compete with the state-of-the-art DNN/HMM method under the traditional GMM/HMM framework. Instead of using DNN as acoustic model, we use DNN as a front-end bottleneck (BN) feature extraction method to decorrelate long feature vectors concatenated from several consecutive speech frames. More importantly, we have proposed two novel incoherent training methods to explicitly de-correlate BN features in learning of DNN. The first method relies on minimizing coherence of weight matrices in DNN while the second one attempts to minimize correlation coefficients of BN features calculated in each mini-batch data in DNN training. Experimental results on a 70-hr Mandarin transcription task and the 309-hr Switchboard task have shown that the traditional GMM/HMMs using BN features can yield comparable performance as DNN/HMM. The proposed incoherent training can produce 2-3% additional gain over the baseline BN features. Finally, the discriminatively trained GMM/HMMs using incoherently trained BN features have consistently surpassed the state-of-the-art DNN/HMMs in all evaluated tasks.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129352683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coverage and area spectral efficiency in downlink random cellular networks with channel estimation error","authors":"Yueping Wu, M. Mckay, R. Heath","doi":"10.1109/ICASSP.2013.6638492","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638492","url":null,"abstract":"We investigate the impact of channel estimation on the performance of downlink random cellular networks. First, we derive a new closed-form expression for the coverage probability under certain practical conditions. We show that the coverage probability is dependent on the user and base station (BS) densities solely through their ratio for arbitrary pilot-training length. Next, we derive the optimal pilot-training length that maximizes the area spectral efficiency (ASE) in several asymptotic regimes, and capture the dependence of this optimal length on the ratio between the user and BS densities. The ASE loss due to training is shown to be less significant in small cell networks with a larger base station density.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123829936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voice activity detection based on frequency modulation of harmonics","authors":"Chung-Chien Hsu, Tse-En Lin, Jian-Hueng Chen, T. Chi","doi":"10.1109/ICASSP.2013.6638954","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638954","url":null,"abstract":"In this paper, we propose a voice activity detection (VAD) algorithm based on spectro-temporal modulation structures of input sounds. A multi-resolution spectro-temporal analysis framework is used to inspect prominent speech structures. By comparing with an adaptive threshold, the proposed VAD distinguishes speech from non-speech based on the energy of the frequency modulation of harmonics. Compared with three standard VADs, ITU-T G.729B, ETSI AMR1 and AMR2, our proposed VAD significantly outperforms them in non-stationary noises in terms of the receiver operating characteristic (ROC) curves and the recognition rates from a practical distributed speech recognition (DSR) system.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126428944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint source-channel coding of 3D video using multiview coding","authors":"Arash Vosoughi, Vanessa Testoni, P. Cosman, L. Milstein","doi":"10.1109/ICASSP.2013.6638014","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638014","url":null,"abstract":"We consider the joint source-channel coding problem of a 3D video transmitted over an AWGN channel. The goal is to minimize the total number of bits, which is the sum of the number of source bits and the number of forward error correction bits, under two constraints: the quality of the primary view and the quality of the secondary view must be greater than or equal to a predetermined threshold at the receiver. The quality is measured in terms of the expected PSNR of an entire decoded group of pictures. A MVC (multiview coding) encoder is used as the source encoder, and rate compatible punctured turbo codes are utilized for protection of the encoded 3D video over the noisy channel. Equal error protection and unequal error protection are compared for various 3D video sequences and noise levels.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126552722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}