2011 IEEE Workshop on Automatic Speech Recognition & Understanding: Latest Publications

Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition
Pub Date: 2011-12-01 | DOI: 10.1109/ASRU.2011.6163956
David Imseng, Ramya Rasipuram, M. Magimai.-Doss
Abstract: One of the main challenges in non-native speech recognition is handling the acoustic variability present in multi-accented non-native speech with a limited amount of training data. In this paper, we investigate an approach that addresses this challenge by using Kullback-Leibler divergence based hidden Markov models (KL-HMM). More precisely, the acoustic variability in the multi-accented speech is handled by using multilingual phoneme posterior probabilities, estimated by a multilayer perceptron trained on auxiliary data, as the input feature for the KL-HMM system. With limited training data, we then build better acoustic models by exploiting the fact that the KL-HMM system has fewer parameters. On the HIWIRE corpus, the proposed approach yields a word error rate (WER) of 1.9% with 149 minutes of training data and a WER of 5.5% with 2 minutes of training data.
Citations: 27
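The core scoring idea in a KL-HMM is that each HMM state holds a trainable categorical distribution over phoneme classes, and the local emission cost for a frame is the KL divergence between that distribution and the MLP's posterior vector. A minimal sketch of that local score (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def kl_local_score(state_dist, posterior, eps=1e-12):
    """KL(state_dist || posterior): cost of emitting an MLP posterior
    vector from a KL-HMM state with categorical distribution state_dist."""
    p = np.clip(state_dist, eps, 1.0)
    q = np.clip(posterior, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

# Illustrative 3-phoneme posteriors: a frame matching the state's
# distribution scores a lower (better) cost than a mismatched frame.
state = np.array([0.8, 0.1, 0.1])
good_frame = np.array([0.7, 0.2, 0.1])
bad_frame = np.array([0.1, 0.1, 0.8])
assert kl_local_score(state, good_frame) < kl_local_score(state, bad_frame)
```

Because the model only stores one categorical distribution per state, it has far fewer parameters than a Gaussian-mixture system, which is the advantage the abstract exploits for small training sets.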
Strategies for training large scale neural network language models
Pub Date: 2011-12-01 | DOI: 10.1109/ASRU.2011.6163930
Tomas Mikolov, Anoop Deoras, Daniel Povey, L. Burget, J. Černocký
Abstract: We describe how to effectively train neural network based language models on large data sets. Faster convergence during training and better overall performance are observed when the training data are sorted by their relevance. We introduce a hash-based implementation of a maximum entropy model that can be trained as part of the neural network model, which leads to a significant reduction in computational complexity. We achieved around a 10% relative reduction in word error rate on an English Broadcast News speech recognition task, against a large 4-gram model trained on 400M tokens.
Citations: 528
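The hash-based maximum entropy idea can be sketched as hashing each n-gram feature into a fixed-size weight table instead of storing an explicit feature vocabulary; collisions are simply tolerated. The table size, hash function, and helper names below are illustrative assumptions, not details from the paper:

```python
import zlib

HASH_SIZE = 1_000_000  # hypothetical fixed-size weight table

def ngram_feature_index(history, word, order):
    """Hash an (n-gram context, word) feature into the weight table."""
    context = history[-(order - 1):] if order > 1 else []
    key = " ".join(context + [word]).encode()
    return zlib.crc32(key) % HASH_SIZE

weights = [0.0] * HASH_SIZE  # trained jointly with the neural net

def maxent_score(history, word, max_order=4):
    # Sum the hashed feature weights over all n-gram orders up to max_order.
    return sum(weights[ngram_feature_index(history, word, n)]
               for n in range(1, max_order + 1))
```

Memory usage is fixed by HASH_SIZE regardless of how many distinct n-grams appear in the 400M-token corpus, which is where the complexity reduction comes from.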
Derivative kernels for noise robust ASR
Pub Date: 2011-12-01 | DOI: 10.1109/ASRU.2011.6163916
A. Ragni, M. Gales
Abstract: Recently there has been interest in combining generative and discriminative classifiers. In these classifiers, features for the discriminative models are derived from generative kernels. One advantage of using generative kernels is that systematic approaches exist for introducing complex dependencies into the feature-space. Furthermore, as the features are based on generative models, standard model-based compensation and adaptation techniques can be applied to make the discriminative models robust to noise and speaker conditions. This paper extends previous work in this framework in several directions. First, it introduces derivative kernels based on context-dependent generative models. Second, it describes how derivative kernels can be incorporated into structured discriminative models. Third, it addresses the issues associated with the large number of classes and parameters that arise when context-dependent models and the high-dimensional feature-spaces of derivative kernels are used. The approach is evaluated on two noise-corrupted tasks: the small vocabulary AURORA 2 task and the medium-to-large vocabulary AURORA 4 task.
Citations: 29
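A derivative kernel builds its feature-space from gradients of a generative model's log-likelihood with respect to its parameters. As a minimal sketch (assuming a single diagonal-covariance Gaussian as the generative model; the paper's context-dependent models are far richer), the derivative features for one observation look like:

```python
import numpy as np

def gaussian_fisher_score(x, mean, var):
    """Derivative features for a diagonal Gaussian: gradients of the
    log-likelihood w.r.t. the mean and the log-variance. These feed a
    discriminative classifier; compensating the Gaussian for noise
    automatically compensates the features too."""
    d_mean = (x - mean) / var                              # d logN / d mean
    d_logvar = 0.5 * (((x - mean) ** 2) / var - 1.0)       # d logN / d log var
    return np.concatenate([d_mean, d_logvar])
```

At x equal to the mean, the mean-gradient vanishes and the log-variance gradient is -0.5 per dimension, so the features directly encode how each observation deviates from the generative model.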
Sparse Maximum A Posteriori adaptation
Pub Date: 2011-10-28 | DOI: 10.1109/ASRU.2011.6163905
P. Olsen, Jing Huang, V. Goel, Steven J. Rennie
Abstract: Maximum A Posteriori (MAP) adaptation is a powerful tool for building speaker-specific acoustic models. Modern speech applications utilize acoustic models with millions of parameters and serve millions of users. Storing an acoustic model for each user in such settings is costly. However, speaker-specific acoustic models are generally similar to the acoustic model being adapted. By imposing sparseness constraints, we can save significantly on storage, and even improve the quality of the resulting speaker-dependent model. In this paper we utilize the ℓ1 or ℓ0 norm as a regularizer to induce sparsity. We show that we can obtain up to 95% sparsity with negligible loss in recognition accuracy, with both penalties. By removing small differences, which constitute "adaptation noise", sparse MAP is actually able to improve upon MAP adaptation. Sparse MAP reduces the MAP word error rate by 2% relative at 89% sparsity.
Citations: 5
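The ℓ1 mechanism can be sketched with a soft-thresholding (proximal) step on the adapted-minus-prior parameter differences: small "adaptation noise" deltas are zeroed, so only the surviving deltas need to be stored per speaker. This is an illustrative sketch, not the authors' exact estimator; the threshold and array sizes are made up:

```python
import numpy as np

def sparse_map_means(map_means, prior_means, threshold):
    """Soft-threshold the MAP-minus-prior mean differences (the l1
    proximal step). Returns the sparsified model and the fraction of
    deltas that were zeroed (the storage savings)."""
    delta = map_means - prior_means
    shrunk = np.sign(delta) * np.maximum(np.abs(delta) - threshold, 0.0)
    return prior_means + shrunk, float(np.mean(shrunk == 0.0))

rng = np.random.default_rng(0)
prior = rng.normal(size=1000)
adapted = prior + rng.normal(scale=0.01, size=1000)  # mostly tiny "noise" changes
adapted[:50] += 0.5                                  # a few genuine speaker shifts
model, sparsity = sparse_map_means(adapted, prior, threshold=0.05)
```

On this synthetic example the threshold wipes out the noise-level deltas while keeping the genuine shifts, mirroring how sparse MAP can both shrink storage and denoise the adaptation.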
An hierarchical exemplar-based sparse model of speech, with an application to ASR
Pub Date: 2011-12-01 | DOI: 10.1109/ASRU.2011.6163913
J. Gemmeke, H. V. hamme
Abstract: We propose a hierarchical exemplar-based model of speech, as well as a new algorithm, to efficiently find sparse linear combinations of exemplars in dictionaries containing hundreds of thousands of exemplars. We use a variant of hierarchical agglomerative clustering to find a hierarchy connecting all exemplars, so that each exemplar is a parent to two child nodes. We use a modified version of a multiplicative-updates based algorithm to find sparse representations, starting from a small active set of exemplars from the dictionary. Namely, on each iteration we replace exemplars that have an increasing weight with their child nodes. We illustrate the properties of the proposed method by investigating the computational effort, the accuracy of the eventual sparse representation, and speech recognition accuracy on a digit recognition task.
Citations: 4
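The multiplicative-updates core (without the hierarchical active-set refinement) can be sketched as nonnegative sparse coding: activations stay nonnegative by construction, and an ℓ1 penalty in the denominator drives small weights toward zero. The dictionary, penalty value, and iteration count below are illustrative assumptions:

```python
import numpy as np

def sparse_activations(D, x, n_iter=500, penalty=0.01, eps=1e-9):
    """Multiplicative updates for nonnegative sparse coding: find a >= 0
    with x ~ D @ a. The penalty term in the denominator is the l1
    sparsity pressure; updates preserve nonnegativity automatically."""
    a = np.full(D.shape[1], 1.0 / D.shape[1])
    for _ in range(n_iter):
        a *= (D.T @ x) / (D.T @ (D @ a) + penalty + eps)
    return a

# Tiny dictionary of 4 nonnegative "exemplars"; x is built from two of them.
rng = np.random.default_rng(1)
D = np.abs(rng.normal(size=(20, 4)))
x = 0.7 * D[:, 0] + 0.3 * D[:, 2]
a = sparse_activations(D, x)
```

The paper's contribution is making this kind of update tractable when the dictionary has hundreds of thousands of columns, by walking down the clustering hierarchy instead of updating every exemplar.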