2009 IEEE Workshop on Automatic Speech Recognition & Understanding: Latest Publications

Using temporal information for improving articulatory-acoustic feature classification
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373314
Barbara Schuppler, Joost van Doremalen, O. Scharenborg, B. Cranen, L. Boves
{"title":"Using temporal information for improving articulatory-acoustic feature classification","authors":"Barbara Schuppler, Joost van Doremalen, O. Scharenborg, B. Cranen, L. Boves","doi":"10.1109/ASRU.2009.5373314","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373314","url":null,"abstract":"This paper combines acoustic features with a high temporal and a high frequency resolution to reliably classify articulatory events of short duration, such as bursts in plosives. SVM classification experiments on TIMIT and SVArticulatory showed that articulatory-acoustic features (AFs) based on a combination of MFCCs derived from a long window of 25ms and a short window of 5ms that are both shifted with 2.5ms steps (Both) outperform standard MFCCs derived with a window of 25 ms and a shift of 10 ms (Baseline). Finally, comparison of the TIMIT and SVArticulatory results showed that for classifiers trained on data that allows for asynchronously changing AFs (SVArticulatory) the improvement from Baseline to Both is larger than for classifiers trained on data where AFs change simultaneously with the phone boundaries (TIMIT).","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131033030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Investigations on features for log-linear acoustic models in continuous speech recognition
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373362
Simon Wiesler, M. Nußbaum-Thom, G. Heigold, R. Schlüter, H. Ney
{"title":"Investigations on features for log-linear acoustic models in continuous speech recognition","authors":"Simon Wiesler, M. Nußbaum-Thom, G. Heigold, R. Schlüter, H. Ney","doi":"10.1109/ASRU.2009.5373362","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373362","url":null,"abstract":"Hidden Markov Models with Gaussian Mixture Models as emission probabilities (GHMMs) are the underlying structure of all state-of-the-art speech recognition systems. Using Gaussian mixture distributions follows the generative approach where the class-conditional probability is modeled, although for classification only the posterior probability is needed. Though being very successful in related tasks like Natural Language Processing (NLP), in speech recognition direct modeling of posterior probabilities with log-linear models has rarely been used and has not been applied successfully to continuous speech recognition. In this paper we report competitive results for a speech recognizer with a log-linear acoustic model on the Wall Street Journal corpus, a Large Vocabulary Continuous Speech Recognition (LVCSR) task. We trained this model from scratch, i.e. without relying on an existing GHMM system. Previously the use of data dependent sparse features for log-linear models has been proposed. We compare them with polynomial features and show that the combination of polynomial and data dependent sparse features leads to better results.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125881899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Multi-view learning of acoustic features for speaker recognition
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373462
Karen Livescu, Mark Stoehr
{"title":"Multi-view learning of acoustic features for speaker recognition","authors":"Karen Livescu, Mark Stoehr","doi":"10.1109/ASRU.2009.5373462","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373462","url":null,"abstract":"We consider learning acoustic feature transformations using an additional view of the data, in this case video of the speaker's face. Specifically, we consider a scenario in which clean audio and video is available at training time, while at test time only noisy audio is available. We use canonical correlation analysis (CCA) to learn linear projections of the acoustic observations that have maximum correlation with the video frames. We provide an initial demonstration of the approach on a speaker recognition task using data from the VidTIMIT corpus. The projected features, in combination with baseline MFCCs, outperform the baseline recognizer in noisy conditions. The techniques we present are quite general, although here we apply them to the case of a specific speaker recognition task. This is the first work of which we are aware in which multiple views are used to learn an acoustic feature projection at training time, while using only the acoustics at test time.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124613608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Kernel metric learning for phonetic classification
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373389
J. Huang, Xi Zhou, M. Hasegawa-Johnson, Thomas S. Huang
{"title":"Kernel metric learning for phonetic classification","authors":"J. Huang, Xi Zhou, M. Hasegawa-Johnson, Thomas S. Huang","doi":"10.1109/ASRU.2009.5373389","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373389","url":null,"abstract":"While a sound spoken is described by a handful of frame-level spectral vectors, not all frames have equal contribution for either human perception or machine classification. In this paper, we introduce a novel framework to automatically emphasize important speech frames relevant to phonetic information. We jointly learn the importance of speech frames by a distance metric across the phone classes, attempting to satisfy a large margin constraint: the distance from a segment to its correct label class should be less than the distance to any other phone class by the largest possible margin. Furthermore, an universal background model structure is proposed to give the correspondence between statistical models of phone types and tokens, allowing us to use statistical models of each phone token in a large margin speech recognition framework. Experiments on TIMIT database demonstrated the effectiveness of our framework.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126272397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Transition features for CRF-based speech recognition and boundary detection
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373287
Spiros Dimopoulos, E. Fosler-Lussier, Chin-Hui Lee, A. Potamianos
{"title":"Transition features for CRF-based speech recognition and boundary detection","authors":"Spiros Dimopoulos, E. Fosler-Lussier, Chin-Hui Lee, A. Potamianos","doi":"10.1109/ASRU.2009.5373287","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373287","url":null,"abstract":"In this paper, we investigate a variety of spectral and time domain features for explicitly modeling phonetic transitions in speech recognition. Specifically, spectral and energy distance metrics, as well as, time derivatives of phonological descriptors and MFCCs are employed. The features are integrated in an extended Conditional Random Fields statistical modeling framework that supports general-purpose transition models. For evaluation purposes, we measure both phonetic recognition task accuracy and precision/recall of boundary detection. Results show that when transition features are used in a CRF-based recognition framework, recognition performance improves significantly due to the reduction of phone deletions. The boundary detection performance also improves mainly for transitions among silence, stop, and fricative phonetic classes.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123563996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Robust vocabulary independent keyword spotting with graphical models
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373544
M. Wöllmer, F. Eyben, Björn Schuller, G. Rigoll
{"title":"Robust vocabulary independent keyword spotting with graphical models","authors":"M. Wöllmer, F. Eyben, Björn Schuller, G. Rigoll","doi":"10.1109/ASRU.2009.5373544","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373544","url":null,"abstract":"This paper introduces a novel graphical model architecture for robust and vocabulary independent keyword spotting which does not require the training of an explicit garbage model. We show how a graphical model structure for phoneme recognition can be extended to a keyword spotter that is robust with respect to phoneme recognition errors. We use a hidden garbage variable together with the concept of switching parents to model keywords as well as arbitrary speech. This implies that keywords can be added to the vocabulary without having to re-train the model. Thereby the design of our model architecture is optimised to reliably detect keywords rather than to decode keyword phoneme sequences as arbitrary speech, while offering a parameter to adjust the operating point on the receiver operating characteristics curve. Experiments on the TIMIT corpus reveal that our graphical model outperforms a comparable hidden Markov model based keyword spotter that uses conventional garbage modelling.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125197866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Discriminative adaptive training with VTS and JUD
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373266
F. Flego, M. Gales
{"title":"Discriminative adaptive training with VTS and JUD","authors":"F. Flego, M. Gales","doi":"10.1109/ASRU.2009.5373266","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373266","url":null,"abstract":"Adaptive training is a powerful approach for building speech recognition systems on non-homogeneous training data. Recently approaches based on predictive model-based compensation schemes, such as Joint Uncertainty Decoding (JUD) and Vector Taylor Series (VTS), have been proposed. This paper reviews these model-based compensation schemes and relates them to factor-analysis style systems. Forms of Maximum Likelihood (ML) adaptive training with these approaches are described, based on both second-order optimisation schemes and Expectation Maximisation (EM). However, discriminative training is used in many state-of-the-art speech recognition. Hence, this paper proposes discriminative adaptive training with predictive model-compensation approaches for noise robust speech recognition. This training approach is applied to both JUD and VTS compensation with minimum phone error training. A large scale multi-environment training configuration is used and the systems evaluated on a range of in-car collected data tasks.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117247743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Garbage modeling with decoys for a sequential recognition scenario
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5372919
Michael Levit, Shuangyu Chang, B. Buntschuh
{"title":"Garbage modeling with decoys for a sequential recognition scenario","authors":"Michael Levit, Shuangyu Chang, B. Buntschuh","doi":"10.1109/ASRU.2009.5372919","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5372919","url":null,"abstract":"This paper is concerned with a speech recognition scenario where two unequal ASR systems, one fast with constrained resources, the other significantly slower but also much more powerful, work together in a sequential manner. In particular, we focus on decisions when to accept the results of the first recognizer and when the second recognizer needs to be consulted. As a kind of application-dependent garbage modeling, we suggest an algorithm that augments the grammar of the first recognizer with those valid paths through the language model of the second recognizer that are confusable with the phrases from this grammar. We show how this algorithm outperforms a system that only looks at recognition confidences by about 20% relative.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116844984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Automatic selection of recognition errors by respeaking the intended text
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373347
K. Vertanen, P. Kristensson
{"title":"Automatic selection of recognition errors by respeaking the intended text","authors":"K. Vertanen, P. Kristensson","doi":"10.1109/ASRU.2009.5373347","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5373347","url":null,"abstract":"We investigate how to automatically align spoken corrections with an initial speech recognition result. Such automatic alignment would enable one-step voice-only correction in which users simply respeak their intended text. We present three new models for automatically aligning corrections: a 1-best model, a word confusion network model, and a revision model. The revision model allows users to alter what they intended to write even when the initial recognition was completely correct. We evaluate our models with data gathered from two user studies. We show that providing just a single correct word of context dramatically improves alignment success from 65% to 84%. We find that a majority of users provide such context without being explicitly instructed to do so. We find that the revision model is superior when users modify words in their initial recognition, improving alignment success from 73% to 83%. We show how our models can easily incorporate prior information about correction location and we show that such information aids alignment success. Last, we observe that users speak their intended text faster and with fewer re-recordings than if they are forced to speak misrecognized text.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129630941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Robust distributed speech recognition using two-stage Filtered Minima Controlled Recursive Averaging
2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5372925
Negar Ghourchian, S. Selouani, D. O'Shaughnessy
{"title":"Robust distributed speech recognition using two-stage Filtered Minima Controlled Recursive Averaging","authors":"Negar Ghourchian, S. Selouani, D. O'Shaughnessy","doi":"10.1109/ASRU.2009.5372925","DOIUrl":"https://doi.org/10.1109/ASRU.2009.5372925","url":null,"abstract":"This paper examines the use of a new Filtered Minima-Controlled Recursive Averaging (FMCRA) noise estimation technique as a robust front-end processing to improve the performance of a Distributed Speech Recognition (DSR) system in noisy environments. The noisy speech is enhanced by using a two-stage framework in order to simultaneously address the inefficiency of the Voice Activity Detector (VAD) and to remedy the inadequacies of MCRA. The performance evaluation carried out on the Aurora 2 task showed that the inclusion of FMCRA in the front-end side leads to a significant improvement in DSR accuracy.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126363374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4