Multi-band based recognition of spoken Arabic numerals using wavelet transform
W. Alkhaldi, W. Fakhr, N. Hamdy
Proceedings of the Nineteenth National Radio Science Conference. Published 2002-11-07. DOI: 10.1109/NRSC.2002.1022626 (https://doi.org/10.1109/NRSC.2002.1022626)
Automatic speech recognition (ASR) based on multi-band decomposition achieves high recognition rates, especially in noisy environments. The discrete wavelet transform (DWT) is an efficient tool for decomposing signals into frequency sub-bands. This paper proposes applying feature recombination (FC) to the recognition of spoken Arabic numerals. Each utterance is decomposed with the DWT, and cepstral coefficients are computed for each resulting sub-band. These coefficients are concatenated into a single feature vector, which is passed to the speech classifier, e.g. a hidden Markov model (HMM), to compute the likelihood. Simulation results show that the correct recognition rates achieved with the proposed method are comparable to those of a conventional full-band ASR system.
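The feature-extraction pipeline described in the abstract (DWT decomposition, per-band cepstral coefficients, concatenation into one vector) can be illustrated with a minimal sketch. The abstract does not state the wavelet family, the number of decomposition levels, the number of coefficients per band, or the exact cepstral analysis used, so everything below is an assumption made for illustration: a Haar wavelet, two levels, four coefficients per band, and a cepstrum-like feature computed as the DCT-II of the log magnitude spectrum. The function names (`haar_dwt`, `wavelet_subbands`, `cepstral_coeffs`, `multiband_features`) are hypothetical, and the HMM classifier stage is not shown.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_subbands(signal, levels):
    """Split a frame into levels + 1 sub-bands, ordered from low to high frequency."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)  # detail bands arrive highest frequency first
    bands.append(approx)      # the final approximation is the lowest band
    return bands[::-1]


def cepstral_coeffs(band, n_ceps):
    """Cepstrum-like features: DCT-II of the log magnitude spectrum (assumed variant)."""
    n = len(band)
    log_mag = []
    for k in range(n // 2 + 1):  # naive DFT over the non-negative frequency bins
        re = sum(band[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(band[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(math.hypot(re, im) + 1e-10))  # offset avoids log(0)
    m = len(log_mag)
    return [
        sum(log_mag[j] * math.cos(math.pi * q * (j + 0.5) / m) for j in range(m))
        for q in range(n_ceps)
    ]


def multiband_features(frame, levels=2, n_ceps=4):
    """Concatenate the per-band coefficients into a single feature vector."""
    features = []
    for band in wavelet_subbands(frame, levels):
        features.extend(cepstral_coeffs(band, n_ceps))
    return features


# Synthetic 64-sample frame standing in for one frame of a spoken-digit utterance.
frame = [
    math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
    for t in range(64)
]
features = multiband_features(frame)
print(len(features))  # 3 sub-bands x 4 coefficients = 12
```

In a full recognizer, one such vector would be computed per frame and the resulting sequence scored against each digit's model; a practical system would more likely use a dedicated wavelet and signal-processing library than this pure-Python version.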