{"title":"Joint unsupervised adaptation of n-gram and RNN language models via LDA-based hybrid mixture modeling","authors":"Ryo Masumura, Taichi Asami, H. Masataki, Y. Aono","doi":"10.1109/APSIPA.2017.8282277","DOIUrl":null,"url":null,"abstract":"This paper reports an initial study of unsupervised adaptation that assumes simultaneous use of both n-gram and recurrent neural network (RNN) language models (LMs) in automatic speech recognition (ASR). It is known that a combination of n-grams and RNN LMs is a more effective approach to ASR than using each of them singly. However, unsupervised adaptation methods that simultaneously adapt both n-grams and RNN LMs have not been presented while various unsupervised adaptation methods specific to either n-gram LMs or RNN LMs have been examined. In order to handle different LMs in a unified unsupervised adaptation framework, our key idea is to introduce mixture modeling for both n-gram LMs and RNN LMs. The mixture modeling can simultaneously handle multiple LMs and unsupervised adaptation can be easily accomplished merely by adjusting their mixture weights using a recognition hypothesis of an input speech. This paper proposes joint unsupervised adaptation achieved by a hybrid mixture modeling using both n-gram mixture models and RNN mixture models. We present latent Dirichlet allocation based hybrid mixture modeling for effective topic adaptation. Our experiments in lecture ASR tasks show the effectiveness of joint unsupervised adaptation. We also reveal performance in which only one n-gram or RNN LM is adapted.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APSIPA.2017.8282277","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This paper reports an initial study of unsupervised adaptation that assumes the simultaneous use of both n-gram and recurrent neural network (RNN) language models (LMs) in automatic speech recognition (ASR). Combining n-gram and RNN LMs is known to be more effective for ASR than using either alone. However, while various unsupervised adaptation methods specific to either n-gram LMs or RNN LMs have been examined, no method that adapts both simultaneously has been presented. To handle the different LMs in a unified unsupervised adaptation framework, our key idea is to introduce mixture modeling for both n-gram LMs and RNN LMs. Mixture modeling can handle multiple LMs at once, and unsupervised adaptation is accomplished simply by adjusting the mixture weights using a recognition hypothesis of the input speech. This paper proposes joint unsupervised adaptation achieved by hybrid mixture modeling that uses both n-gram mixture models and RNN mixture models, and presents latent Dirichlet allocation (LDA) based hybrid mixture modeling for effective topic adaptation. Experiments on lecture ASR tasks show the effectiveness of joint unsupervised adaptation. We also report the performance obtained when only the n-gram LM or only the RNN LM is adapted.
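To make the mixture-weight idea concrete, below is a minimal sketch of the kind of unsupervised weight re-estimation the abstract describes: expectation-maximization (EM) over the mixture weights of several component LMs, scored on a recognition hypothesis. The function name `adapt_mixture_weights` and the toy probabilities are illustrative assumptions, not the authors' implementation; the key point is that only per-word probabilities from each component are needed, so the same update applies whether the components are n-gram LMs or RNN LMs.

```python
import numpy as np

def adapt_mixture_weights(comp_probs, n_iters=20):
    """EM re-estimation of LM mixture weights on a recognition hypothesis.

    comp_probs: (T, K) array where comp_probs[t, k] is the probability
    that component LM k assigns to the t-th hypothesis word given its
    history. The update is agnostic to whether a component is an n-gram
    LM or an RNN LM, since only its per-word probabilities are used.
    """
    T, K = comp_probs.shape
    lam = np.full(K, 1.0 / K)  # start from uniform mixture weights
    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each word
        joint = comp_probs * lam                          # (T, K)
        gamma = joint / joint.sum(axis=1, keepdims=True)  # normalize per word
        # M-step: new weights = average responsibility over the hypothesis
        lam = gamma.mean(axis=0)
    return lam

# Toy usage: three component LMs scored on a 5-word hypothesis.
probs = np.array([
    [0.02, 0.10, 0.01],
    [0.05, 0.20, 0.02],
    [0.01, 0.08, 0.01],
    [0.03, 0.15, 0.02],
    [0.02, 0.12, 0.01],
])
print(adapt_mixture_weights(probs))  # the best-matching component dominates
```

In a topic-adaptation setting like the paper's, each component would correspond to a topic (e.g., induced by LDA), and the adapted weights would then interpolate the component LMs when rescoring or decoding the same utterance.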