Robust speech recognition using a noise rejection approach
E. Khan, R. Levinson
Proceedings. IEEE International Joint Symposia on Intelligence and Systems (Cat. No.98EX174), 1998-03-21
DOI: 10.1109/IJSIS.1998.685469
In this paper, we explore new approaches to improving speech recognition accuracy in noisy environments. The two key design choices are: (a) no additional training data is used (only speakers' data, no noise data), and (b) there is no adaptation phase for noise. Instead of adapting in the recognition stage, the preprocessing stage, or both, we build a noise-tolerant (noise-rejecting) speech recognition system whose inherent structure lets it reject noise automatically. We call this a noise rejection-based approach. Noise rejection is achieved by using multiple views and dynamic features of the input sequences. Multiple views exploit more information from the available data, which is used to train multiple HMMs (hidden Markov models). This makes training simpler and faster, and avoids the need for a noise database, which is often difficult to obtain. The dynamic features (added to the HMM using vector emission probabilities) add more information about the input speech during training. Because the dynamic features of noise are usually much smaller in value than those of the speech signal, they help reject noise during recognition. During recognition, multiple views of the input sequence are also applied to multiple HMMs, and the outcomes of the multiple HMMs are combined using a maximum evidence criterion. Our tests show very encouraging results. We also incorporate higher-level decision making to combine the outcomes of the multiple HMMs more judiciously and further improve accuracy. For this, we use meta-reasoning to identify the problem complexity and allocate resources accordingly.
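The two mechanisms named in the abstract, dynamic features and the combination of several HMM outcomes, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define how dynamic features are computed or what exactly the maximum evidence criterion is, so the sketch assumes the regression-based delta formula common in speech front ends, and it reads "maximum evidence" as picking the word with the highest log-likelihood reported by any single view. The names `delta_features`, `combine_max_evidence`, and the view labels are hypothetical, and the HMM scoring itself is left out.

```python
from typing import Dict, List, Sequence, Tuple


def delta_features(frames: Sequence[Sequence[float]], width: int = 2) -> List[List[float]]:
    """Regression-based delta (dynamic) features, one vector per frame.

    Assumed formula (a common convention, not taken from the paper):
        d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the nearest edge frame.
    """
    if not frames:
        return []
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def combine_max_evidence(view_scores: Dict[str, Dict[str, float]]) -> Tuple[str, str, float]:
    """Return (word, view, score) for the highest log-likelihood over all views.

    view_scores maps a view name to {word: log-likelihood}, as would be
    produced by one HMM set per view. This reading of "maximum evidence"
    is an assumption; the paper may define the criterion differently.
    """
    best = None
    for view, scores in view_scores.items():
        for word, score in scores.items():
            if best is None or score > best[2]:
                best = (word, view, score)
    if best is None:
        raise ValueError("no scores to combine")
    return best


# Toy check: a steadily changing track has a nonzero delta, a constant one has zero.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
flat = [[0.5]] * 5
print(delta_features(ramp)[2], delta_features(flat)[2])  # [1.0] [0.0]
```

The constant track stands in for stationary background noise, whose delta values vanish, which is the intuition behind the abstract's claim that dynamic features of noise are small compared with those of speech. Real noise is rarely perfectly stationary, so the effect in practice would be a reduction rather than exact cancellation.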