Title: Improving the Accuracy of Large Vocabulary Continuous Speech Recognizer Using Dependency Parse Tree and Chomsky Hierarchy in Lattice Rescoring
Authors: Kai Sze Hong, T. Tan, E. Tang
Venue: 2013 International Conference on Asian Language Processing
Publication date: 2013-08-17
DOI: 10.1109/IALP.2013.53 (https://doi.org/10.1109/IALP.2013.53)
Citations: 1
Abstract
This paper describes our approach to using dependency parse tree information to derive useful hidden-word statistics that improve a baseline Malay large vocabulary automatic speech recognition system. Traditional approaches to training language models are based mainly on Chomsky hierarchy type 3, which approximates natural language as a regular language and therefore ignores many of its characteristics. Our work attempts to overcome these limitations by extending the approach to consider Chomsky hierarchy types 1 and 2. We extracted dependency-tree-based lexical information and incorporated it into the language model. A second-pass lattice rescoring was then performed to produce better hypotheses for the Malay large vocabulary continuous speech recognition system. The absolute WER reductions were 2.2% and 3.8% on the MASS and MASS-NEWS corpora, respectively.
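The second-pass rescoring described above can be illustrated with a minimal sketch. This is not the paper's implementation: it simplifies the lattice to an n-best list, and the field names, scores, and interpolation weights are all hypothetical, since the abstract does not specify how the dependency-based statistics are combined with the baseline scores.

```python
# Hedged sketch: re-rank first-pass hypotheses by interpolating the
# baseline n-gram LM log-score with a dependency-based LM log-score.
# All numbers and weights below are illustrative, not from the paper.

def rescore(hypotheses, lm_weight=0.7, dep_weight=0.3):
    """Return the hypothesis with the best combined log-score."""
    def combined(h):
        return (h["acoustic"]
                + lm_weight * h["ngram_lm"]
                + dep_weight * h["dependency_lm"])
    return max(hypotheses, key=combined)

# Toy n-best list (Malay): the second hypothesis overtakes the first
# once the dependency-based score is taken into account.
nbest = [
    {"text": "saya makan nasi",
     "acoustic": -10.0, "ngram_lm": -4.0, "dependency_lm": -6.0},
    {"text": "saya makan nasi goreng",
     "acoustic": -10.5, "ngram_lm": -4.2, "dependency_lm": -3.0},
]
best = rescore(nbest)
```

With the weights above, the combined scores are -14.6 and -14.34, so the rescoring pass flips the first-pass ranking in favor of the second hypothesis.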