基于Maxout神经元的深度双向LSTM声学建模

Yuan Luo, Yu Liu, Yi Zhang, Boyu Wang, Zhou Ye
{"title":"基于Maxout神经元的深度双向LSTM声学建模","authors":"Yuan Luo, Yu Liu, Yi Zhang, Boyu Wang, Zhou Ye","doi":"10.1109/ROBIO.2017.8324646","DOIUrl":null,"url":null,"abstract":"Recently long short-term memory (LSTM) recurrent neural networks (RNN) have achieved greater success in acoustic models for the large vocabulary continuous speech recognition system. In this paper, we propose an improved hybrid acoustic model based on deep bidirectional long short-term memory (DBLSTM) RNN. In this new acoustic model, maxout neurons are used in the fully-connected part of DBLSTM to solve the problems of vanishing and exploding gradient. At the same time, the dropout regularization algorithm is used to avoid the over-fitting during the training process of neural network. In addition, in order to adapt the bidirectional dependence of DBLSTM at each time step, a context-sensitive-chunk (CSC) back-propagation through time (BPTT) algorithm is proposed to train DBLSTM neural network. Simulation experiments have been made on Switchboard benchmark task. The results show that the WER of the improved hybrid acoustic model is 14.5%, and the optimal network structures and CSC configurations are given.","PeriodicalId":197159,"journal":{"name":"2017 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"454 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Maxout neurons based deep bidirectional LSTM for acoustic modeling\",\"authors\":\"Yuan Luo, Yu Liu, Yi Zhang, Boyu Wang, Zhou Ye\",\"doi\":\"10.1109/ROBIO.2017.8324646\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recently long short-term memory (LSTM) recurrent neural networks (RNN) have achieved greater success in acoustic models for the large vocabulary continuous speech recognition system. In this paper, we propose an improved hybrid acoustic model based on deep bidirectional long short-term memory (DBLSTM) RNN. In this new acoustic model, maxout neurons are used in the fully-connected part of DBLSTM to solve the problems of vanishing and exploding gradient. At the same time, the dropout regularization algorithm is used to avoid the over-fitting during the training process of neural network. In addition, in order to adapt the bidirectional dependence of DBLSTM at each time step, a context-sensitive-chunk (CSC) back-propagation through time (BPTT) algorithm is proposed to train DBLSTM neural network. Simulation experiments have been made on Switchboard benchmark task. The results show that the WER of the improved hybrid acoustic model is 14.5%, and the optimal network structures and CSC configurations are given.\",\"PeriodicalId\":197159,\"journal\":{\"name\":\"2017 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"volume\":\"454 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROBIO.2017.8324646\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO.2017.8324646","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

近年来,长短期记忆(LSTM)递归神经网络(RNN)在大词汇量连续语音识别系统的声学模型方面取得了较大的成功。本文提出了一种基于深度双向长短期记忆(DBLSTM) RNN的改进混合声学模型。在该声学模型中,在DBLSTM的全连接部分使用了maxout神经元来解决梯度消失和爆炸的问题。同时,采用dropout正则化算法,避免了神经网络在训练过程中的过拟合。此外,为了适应DBLSTM在每个时间步的双向依赖性,提出了一种上下文敏感块(CSC)随时间反向传播(BPTT)算法来训练DBLSTM神经网络。在总机基准任务上进行了仿真实验。结果表明,改进后的混合声学模型的噪声比为14.5%,并给出了最优的网络结构和CSC配置。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Maxout neurons based deep bidirectional LSTM for acoustic modeling
Recently long short-term memory (LSTM) recurrent neural networks (RNN) have achieved greater success in acoustic models for the large vocabulary continuous speech recognition system. In this paper, we propose an improved hybrid acoustic model based on deep bidirectional long short-term memory (DBLSTM) RNN. In this new acoustic model, maxout neurons are used in the fully-connected part of DBLSTM to solve the problems of vanishing and exploding gradient. At the same time, the dropout regularization algorithm is used to avoid the over-fitting during the training process of neural network. In addition, in order to adapt the bidirectional dependence of DBLSTM at each time step, a context-sensitive-chunk (CSC) back-propagation through time (BPTT) algorithm is proposed to train DBLSTM neural network. Simulation experiments have been made on Switchboard benchmark task. The results show that the WER of the improved hybrid acoustic model is 14.5%, and the optimal network structures and CSC configurations are given.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信