A Dropout-Based Single Model Committee Approach for Active Learning in ASR

Jiayi Fu, Kuang Ru
In: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2019.
DOI: 10.1109/ASRU46091.2019.9003728

Abstract

In this paper, we propose a new committee-based approach for active learning (AL) in automatic speech recognition (ASR). By selecting the most informative samples, this approach achieves a lower word error rate (WER) with fewer transcriptions. Unlike previous committee-based AL approaches, its committee construction requires training only one acoustic model (AM) with dropout; since only one model is trained, the approach is simpler and faster. Because the AM is improved continuously over AL iterations, we also examined robustness to this improvement and found our approach more robust. In experiments, we compared our approach with random sampling and with a state-of-the-art committee-based method, the heterogeneous neural networks (HNN) based approach, measuring WER, committee-construction time, and robustness to model improvement on a Mandarin ASR task with 1600 hours of speech data. The results show that our approach achieves a 2–3 times larger relative WER reduction than random sampling, and reaches a WER close to that of the HNN-based approach in only 75% of the time.
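The abstract does not give the paper's exact disagreement measure, but the core idea — one acoustic model decoded several times with dropout active, forming a virtual committee whose disagreement scores each unlabeled utterance — can be sketched in a toy form. The sketch below is a hypothetical illustration, not the paper's implementation: `committee_hypotheses` stands in for real dropout decoding, and disagreement is measured as one minus the majority-vote fraction (a vote-entropy-style proxy).

```python
import random
from collections import Counter

def committee_hypotheses(utterance, num_passes=5, dropout=0.3, seed=0):
    """Toy stand-in for decoding one AM num_passes times with dropout
    kept active: each pass may yield a different hypothesis. In the real
    approach this would be the AM's decoded transcript per pass."""
    rng = random.Random(seed)
    hyps = []
    for _ in range(num_passes):
        # Simulate dropout-induced variation: each token is "misrecognized"
        # (here, just upper-cased) with probability = dropout.
        hyp = tuple(tok if rng.random() > dropout else tok.upper()
                    for tok in utterance.split())
        hyps.append(hyp)
    return hyps

def disagreement(hyps):
    """Vote-entropy-style score: 1 - fraction of committee members
    agreeing with the majority hypothesis. 0 means full agreement."""
    counts = Counter(hyps)
    return 1.0 - counts.most_common(1)[0][1] / len(hyps)

def select_informative(pool, budget, **kwargs):
    """Rank unlabeled utterances by committee disagreement and return
    the top `budget` candidates for transcription."""
    scored = sorted(pool,
                    key=lambda u: disagreement(committee_hypotheses(u, **kwargs)),
                    reverse=True)
    return scored[:budget]
```

Because the committee comes from stochastic forward passes of a single trained AM rather than from several separately trained models, committee construction costs only extra decoding, which is the source of the speed advantage the abstract reports over the HNN-based approach.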