Attention Based Speech Model for Japanese Recognization
Deguo Mu, Tao Zhu, Guoliang Xu, Han Li, Dongbin Guo, Yongquan Liu
2019 IEEE International Conference on Smart Internet of Things (SmartIoT), August 2019. DOI: 10.1109/SmartIoT.2019.00071
Abstract
Deep neural networks have recently been applied to automatic speech recognition and have brought large gains in accuracy. In particular, convolutional neural networks (CNNs) are used for acoustic feature extraction, which improves both recognition accuracy and parallel efficiency. The attention mechanism has also shown strong performance in sequence-to-sequence tasks. Building on a speech recognition model that combines an attention mechanism with a CNN and an LSTM (Long Short-Term Memory) network, this paper trains on a corpus of 10,000 Japanese sentences. Without any language model, the pronunciation accuracy on the Japanese fifty-sound (gojūon) syllabary reaches 89%.
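
The pipeline described in the abstract (a CNN acoustic front end feeding an LSTM encoder, with an attention-based decoder emitting syllable labels) can be sketched roughly as below. This is a minimal illustration under assumptions, not the authors' implementation: the log-mel input features, layer sizes, additive attention form, and vocabulary size (about fifty gojūon symbols plus padding) are all assumed.

# Minimal sketch (assumed architecture, not the paper's exact model):
# CNN over log-mel frames -> bidirectional LSTM encoder -> additive
# attention decoder producing one syllable class per output step.
import torch
import torch.nn as nn


class CNNLSTMAttentionASR(nn.Module):
    def __init__(self, n_mels=80, vocab_size=52, hidden=256):
        super().__init__()
        # CNN acoustic feature extractor: two strided conv layers over (time, freq)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        feat_dim = 32 * (n_mels // 4)            # channels * reduced mel bins
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True,
                               bidirectional=True)
        # Additive (Bahdanau-style) attention over encoder states
        self.att_enc = nn.Linear(2 * hidden, hidden)
        self.att_dec = nn.Linear(hidden, hidden)
        self.att_v = nn.Linear(hidden, 1)
        self.decoder = nn.LSTMCell(2 * hidden + vocab_size, hidden)
        self.embed = nn.Embedding(vocab_size, vocab_size)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, mels, targets):
        # mels: (B, T, n_mels) log-mel features; targets: (B, U) token ids
        x = self.cnn(mels.unsqueeze(1))          # (B, 32, T/4, n_mels/4)
        B, C, T, F = x.shape
        x = x.permute(0, 2, 1, 3).reshape(B, T, C * F)
        enc, _ = self.encoder(x)                 # (B, T, 2*hidden)
        h = enc.new_zeros(B, self.decoder.hidden_size)
        c = torch.zeros_like(h)
        logits = []
        for u in range(targets.size(1)):
            # attention weights over encoder time steps
            score = self.att_v(torch.tanh(self.att_enc(enc)
                                          + self.att_dec(h).unsqueeze(1)))
            alpha = torch.softmax(score, dim=1)  # (B, T, 1)
            context = (alpha * enc).sum(dim=1)   # (B, 2*hidden)
            dec_in = torch.cat([context, self.embed(targets[:, u])], dim=-1)
            h, c = self.decoder(dec_in, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)        # (B, U, vocab_size)


if __name__ == "__main__":
    model = CNNLSTMAttentionASR()
    mels = torch.randn(2, 200, 80)               # 2 utterances, 200 frames
    targets = torch.randint(0, 52, (2, 10))      # teacher-forced token ids
    print(model(mels, targets).shape)            # torch.Size([2, 10, 52])

A real training setup would add a cross-entropy loss over the per-step logits and a start/end token convention for decoding; those details are not specified in the abstract and are omitted here.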