Multi-band based recognition of spoken Arabic numerals using wavelet transform

W. Alkhaldi, W. Fakhr, N. Hamdy
Published in: Proceedings of the Nineteenth National Radio Science Conference, 2002-11-07
DOI: 10.1109/NRSC.2002.1022626 (https://doi.org/10.1109/NRSC.2002.1022626)
Citations: 12

Abstract

Automatic speech recognition (ASR) using multi-band decomposition provides high recognition rates, especially in noisy environments. The discrete wavelet transform (DWT) is known to be an efficient tool for decomposing signals into frequency sub-bands. This paper proposes applying the concept of feature recombination (FC) to the recognition of spoken Arabic numerals. Each utterance is first decomposed using the DWT, and cepstral coefficients are then calculated for each of the resulting sub-bands. The coefficients are concatenated to form a single feature vector, which is used as the input to the speech classifier, e.g. a hidden Markov model (HMM), to compute the likelihood. Simulation results show that the recognition rates achieved with the proposed method are comparable to those of the conventional full-band ASR system.
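
The feature-extraction steps described in the abstract (DWT decomposition, per-band cepstral coefficients, concatenation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet, the number of decomposition levels, or the type of cepstral analysis, so the Haar wavelet, two levels, four coefficients per band, and a plain real cepstrum are all choices made here for brevity. The function names are hypothetical, and the HMM classifier stage is left out.

```python
import cmath
import math


def haar_dwt(signal):
    """One level of the Haar DWT (assumed wavelet): returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def dwt_subbands(signal, levels):
    """Split a signal into levels + 1 sub-bands: detail bands first, final approximation last."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def real_cepstrum(band, n_coeffs):
    """First n_coeffs of the real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(band)
    log_mag = []
    for k in range(n):
        x_k = sum(band[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(abs(x_k) + 1e-10))  # small offset avoids log(0)
    # The log magnitude spectrum of a real signal is symmetric, so the inverse
    # DFT reduces to a cosine sum.
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(n_coeffs)
    ]


def multiband_features(frame, levels=2, n_coeffs=4):
    """Concatenate the per-band cepstral coefficients into one feature vector."""
    features = []
    for band in dwt_subbands(frame, levels):
        features.extend(real_cepstrum(band, n_coeffs))
    return features


# Synthetic 64-sample frame standing in for one frame of speech (assumed data).
frame = [
    math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
    for t in range(64)
]
bands = dwt_subbands(frame, levels=2)
features = multiband_features(frame, levels=2, n_coeffs=4)
print(len(bands), len(features))
```

In a complete recognizer, one such vector would be computed per frame, and the resulting sequence would be passed to the classifier (e.g. one HMM per numeral) to compute likelihoods.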