Unsupervised training of acoustic models for large vocabulary continuous speech recognition

F. Wessel, H. Ney
IEEE Workshop on Automatic Speech Recognition and Understanding, 2001 (ASRU '01). Published 2001-12-09. DOI: 10.1109/ASRU.2001.1034648 (https://doi.org/10.1109/ASRU.2001.1034648)
Citations: 20

Abstract

For speech recognition systems, the amount of acoustic training data is of crucial importance. In the past, large amounts of speech were recorded and transcribed manually for training. Since untranscribed speech is available in various forms these days, the unsupervised training of a speech recognizer on recognized transcriptions is studied. A low-cost recognizer trained with only one hour of manually transcribed speech is used to recognize 72 hours of untranscribed acoustic data. These transcriptions are then used in combination with confidence measures to train an improved recognizer. The effect of confidence measures which are used to detect possible recognition errors is studied systematically. Finally, the unsupervised training is applied iteratively. Using this method, the recognizer is trained with very little manual effort while losing only 14.3% relative on the Broadcast News '96 and 18.6% relative on the Broadcast News '98 evaluation test sets.
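
The procedure described in the abstract (recognize untranscribed audio with a seed model, keep only confidently recognized words, retrain, repeat) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Word`, `select_confident`, and `unsupervised_training`, the `recognize` and `train` callbacks, the word-level confidence score in [0, 1], and the threshold value are all assumptions made for the example. The abstract does not specify how confidences are computed or how the filtered data is combined with the manually transcribed hour.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Word:
    """One recognized word with a hypothetical confidence score in [0, 1]."""
    text: str
    confidence: float


def select_confident(hypothesis: Sequence[Word], threshold: float) -> List[Word]:
    """Keep only the words whose confidence reaches the threshold."""
    return [w for w in hypothesis if w.confidence >= threshold]


def unsupervised_training(
    seed_model: Any,
    utterances: Sequence[Any],
    recognize: Callable[[Any, Any], List[Word]],
    train: Callable[[List[Tuple[Any, List[Word]]]], Any],
    threshold: float = 0.7,   # assumed value, not taken from the paper
    iterations: int = 2,      # assumed value, not taken from the paper
) -> Any:
    """Iteratively retrain a model on its own confidence-filtered output.

    `recognize(model, utterance)` and `train(data)` are caller-supplied
    placeholders standing in for a real recognizer and trainer.
    """
    model = seed_model
    for _ in range(iterations):
        # 1. Transcribe the untranscribed audio with the current model.
        hypotheses = [recognize(model, utt) for utt in utterances]
        # 2. Discard words that are likely recognition errors.
        filtered = [
            (utt, select_confident(hyp, threshold))
            for utt, hyp in zip(utterances, hypotheses)
        ]
        # 3. Drop utterances with no surviving words, then retrain.
        training_data = [(utt, words) for utt, words in filtered if words]
        model = train(training_data)
    return model
```

In this sketch the seed model would correspond to the recognizer trained on one hour of manually transcribed speech, and `utterances` to the 72 hours of untranscribed data. Whether the manual transcriptions are added back into each retraining pass is left to the `train` callback.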