{"title":"应用基于时间相关的递归多层神经网络对基于mel的倒谱系数进行鲁棒自动语音识别","authors":"M. Héon, H. Tolba, D. O'Shaughnessy","doi":"10.21437/ICSLP.1998-328","DOIUrl":null,"url":null,"abstract":"In this paper, the problem of robust speech recognition has been considered. Our approach is based on the noise reduction of the parameters that we use for recognition, that is, the Mel-based cepstral coefficients. A Temporal-Correlation-Based Recurrent Multilayer Neural Network (TCRMNN) for noise reduction in the cepstral domain is used in order to get less-variant parameters to be useful for robust recognition in noisy environments. Experiments show that the use of the enhanced parameters using such an approach increases the recognition rate of the continuous speech recognition (CSR) process. The HTK Hidden Markov Model Toolkit was used throughout. Experiments were done on a noisy version of the TIMIT database. With such a pre-processing noise reduction technique in the front-end of the HTK-based continuous speech recognition system (CSR) system, improvements in the recognition accuracy of about 17.77% and 18.58% using single mixture monophones and triphones, respectively, have been obtained at a moderate SNR of 20 dB.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Robust automatic speech recognition by the application of a temporal-correlation-based recurrent multilayer neural network to the mel-based cepstral coefficients\",\"authors\":\"M. Héon, H. Tolba, D. O'Shaughnessy\",\"doi\":\"10.21437/ICSLP.1998-328\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, the problem of robust speech recognition has been considered. Our approach is based on the noise reduction of the parameters that we use for recognition, that is, the Mel-based cepstral coefficients. A Temporal-Correlation-Based Recurrent Multilayer Neural Network (TCRMNN) for noise reduction in the cepstral domain is used in order to get less-variant parameters to be useful for robust recognition in noisy environments. Experiments show that the use of the enhanced parameters using such an approach increases the recognition rate of the continuous speech recognition (CSR) process. The HTK Hidden Markov Model Toolkit was used throughout. Experiments were done on a noisy version of the TIMIT database. With such a pre-processing noise reduction technique in the front-end of the HTK-based continuous speech recognition system (CSR) system, improvements in the recognition accuracy of about 17.77% and 18.58% using single mixture monophones and triphones, respectively, have been obtained at a moderate SNR of 20 dB.\",\"PeriodicalId\":117113,\"journal\":{\"name\":\"5th International Conference on Spoken Language Processing (ICSLP 1998)\",\"volume\":\"120 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1998-11-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"5th International Conference on Spoken Language Processing (ICSLP 1998)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.21437/ICSLP.1998-328\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"5th International Conference on Spoken Language Processing (ICSLP 1998)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ICSLP.1998-328","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust automatic speech recognition by the application of a temporal-correlation-based recurrent multilayer neural network to the mel-based cepstral coefficients
In this paper, the problem of robust speech recognition has been considered. Our approach is based on the noise reduction of the parameters that we use for recognition, that is, the Mel-based cepstral coefficients. A Temporal-Correlation-Based Recurrent Multilayer Neural Network (TCRMNN) for noise reduction in the cepstral domain is used in order to get less-variant parameters to be useful for robust recognition in noisy environments. Experiments show that the use of the enhanced parameters using such an approach increases the recognition rate of the continuous speech recognition (CSR) process. The HTK Hidden Markov Model Toolkit was used throughout. Experiments were done on a noisy version of the TIMIT database. With such a pre-processing noise reduction technique in the front-end of the HTK-based continuous speech recognition system (CSR) system, improvements in the recognition accuracy of about 17.77% and 18.58% using single mixture monophones and triphones, respectively, have been obtained at a moderate SNR of 20 dB.