Continuous Kannada Noisy Speech Recognition
Nadeem Pasha, R. S
2018 International Conference on Recent Innovations in Electrical, Electronics & Communication Engineering (ICRIEECE), published 2018-07-01
DOI: 10.1109/ICRIEECE44171.2018.9009108 (https://doi.org/10.1109/ICRIEECE44171.2018.9009108)
Abstract
Automatic speech recognition (ASR) converts a speech signal into its corresponding text. ASR performance degrades in noisy environments. To address this, speech enhancement is typically applied to the noisy speech before it is fed to the ASR system. Speech enhancement techniques have been developed over the past several decades, but some of them introduce musical noise. To improve recognition accuracy further, a generalized distillation framework is used, in which machines learn from machines. In this paper, an ASR system is implemented for noisy Kannada speech using the generalized distillation framework. In this framework, a teacher machine is trained on clean speech and a student machine on speech corrupted by four different noise types; the teacher helps the student learn by providing the additional information it needs. In the test phase, the student machine is tested on speech with four noise types other than those used in training. A DNN acoustic model is built using 39-dimensional MFSC features, and a bigram language model is created using the Kaldi Speech Recognition Toolkit. Experimental results show that the generalized distillation framework for noisy Kannada speech achieves a lower word error rate (WER) than an HMM-GMM approach.
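The abstract does not state the training objective, so the following is only a minimal sketch of how a teacher can guide a student, assuming the commonly used generalized distillation loss: the student is fit to a weighted mix of the hard labels and the teacher's temperature-softened posteriors. The function names and the `temperature` and `imitation` parameters are illustrative assumptions, not taken from the paper, and the logits would in practice come from the clean-speech teacher DNN and the noisy-speech student DNN for the same frames.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over one frame's logits, scaled by 1/temperature."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, imitation=0.5):
    """Mean per-frame loss mixing hard labels and soft teacher targets.

    temperature and imitation are assumed hyperparameters: imitation=0
    ignores the teacher, imitation=1 ignores the hard labels.
    """
    total = 0.0
    for s_row, t_row, label in zip(student_logits, teacher_logits, labels):
        student = softmax(s_row)
        hard = [1.0 if k == label else 0.0 for k in range(len(s_row))]
        soft = softmax(t_row, temperature)  # softened teacher posteriors
        total += ((1.0 - imitation) * cross_entropy(hard, student)
                  + imitation * cross_entropy(soft, student))
    return total / len(labels)


# Hypothetical logits for two frames over three acoustic classes.
student = [[1.0, 0.2, -0.5], [0.1, 0.9, 0.3]]   # student sees noisy speech
teacher = [[2.5, 0.1, -1.0], [-0.3, 2.0, 0.4]]  # teacher sees clean speech
print(distillation_loss(student, teacher, labels=[0, 1]))
```

A higher temperature flattens the teacher's posteriors, so the student also sees how the teacher ranks the incorrect classes rather than only its top choice.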