Multi-band based recognition of spoken Arabic numerals using wavelet transform

Proceedings of the Nineteenth National Radio Science Conference. Pub Date: 2002-11-07. DOI: 10.1109/NRSC.2002.1022626

W. Alkhaldi, W. Fakhr, N. Hamdy


Citations: 12

Abstract


Multi-band based recognition of spoken Arabic numerals using wavelet transform

Automatic speech recognition (ASR) using multi-band decomposition provides high recognition rates, especially in noisy environments. The discrete wavelet transform (DWT) is known to be an efficient tool for decomposing signals into frequency sub-bands. This paper suggests applying the concept of feature recombination (FC) to the recognition of spoken Arabic numerals. Utterances are first decomposed using the DWT, and cepstral coefficients are then calculated for each resulting sub-band. The coefficients are concatenated into a single feature vector that is used as the input to the speech classifier, e.g. a hidden Markov model (HMM), to compute the likelihood. Simulation results show that the correct recognition rates achieved with the suggested method are comparable to those of the conventional full-band ASR system.
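The feature pipeline described in the abstract (DWT decomposition, per-band cepstral coefficients, concatenation into one vector) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the two decomposition levels, the four coefficients per band, the direct DFT, and the DCT-based cepstrum. The function names are hypothetical, and the HMM classifier stage is not shown.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT (assumed wavelet): returns (approximation, detail).
    A trailing unpaired sample is dropped when the length is odd."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def subbands(signal, levels):
    """Dyadic decomposition into levels + 1 sub-bands, lowest band first."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands[::-1]

def cepstrum(frame, n_coeffs):
    """First n_coeffs cepstral coefficients: DCT-II of the log power spectrum."""
    n = len(frame)
    log_power = []
    for k in range(n // 2 + 1):  # direct DFT; fine for short frames
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        log_power.append(math.log(re * re + im * im + 1e-10))  # small floor avoids log(0)
    m = len(log_power)
    return [
        sum(lp * math.cos(math.pi * q * (j + 0.5) / m) for j, lp in enumerate(log_power))
        for q in range(n_coeffs)
    ]

def multiband_features(frame, levels=2, n_coeffs=4):
    """Concatenate per-band cepstral coefficients into a single feature vector."""
    features = []
    for band in subbands(frame, levels):
        features.extend(cepstrum(band, n_coeffs))
    return features

# Synthetic 64-sample frame standing in for one frame of a spoken numeral.
frame = [
    math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
    for t in range(64)
]
features = multiband_features(frame)
print(len(features))  # (levels + 1) * n_coeffs values per frame
```

In a full recognizer, one such vector per frame would form the observation sequence passed to the classifier. A practical system would more likely use a longer wavelet filter and an FFT-based spectrum than the simplified choices above.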
