Robust front-end based on MVA processing for Arabic speech recognition

2017 International Conference on Engineering & MIS (ICEMIS). Pub Date: 2017-05-01. DOI: 10.1109/ICEMIS.2017.8273064

Elhem Techini, Z. Sakka, M. Bouhlel


Citations: 1

Abstract

This paper presents a noise-robust technique for an Arabic automatic speech recognition engine. The technique, called MVA, combines Cepstral Mean and Variance Normalization (CMVN) with Auto-Regressive Moving Average (ARMA) filtering. MVA is used as a post-processing module for Mel Frequency Cepstral Coefficients (MFCC), Relative Spectral-Perceptual Linear Prediction (RASTA-PLP), and Power Normalized Cepstral Coefficients (PNCC) features to improve recognition accuracy. An isolated Arabic word engine was designed and developed using the Hidden Markov Model (HMM) to perform the recognition process at the back-end. Experimental results on the Arabic database demonstrate that our method provides substantial improvements in recognition accuracy for all features. The results also demonstrate that RASTA-PLP outperforms PNCC and MFCC features for word correction and word accuracy.
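The MVA post-processing step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter order or the exact ARMA form, so everything here is an assumption: the function names `cmvn`, `arma_smooth`, and `mva` are hypothetical, the default `order=2` is arbitrary, and the ARMA recursion (average of the previous `order` outputs with the current and next `order` inputs) is one commonly described form of MVA smoothing, not necessarily the one the authors used. The input is any feature matrix (MFCC, RASTA-PLP, or PNCC) given as a list of frames.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    `frames` is a list of feature vectors (one per time frame). Each
    coefficient is shifted to zero mean and scaled to unit variance
    over the whole utterance.
    """
    columns = list(zip(*frames))  # one tuple per coefficient
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


def arma_smooth(frames, order=2):
    """ARMA low-pass smoothing along time (assumed form, see lead-in).

    For interior frames, the output is the average of the previous
    `order` OUTPUT frames and the current plus next `order` INPUT
    frames. The first and last `order` frames are copied unchanged.
    """
    out = [list(frame) for frame in frames]
    for t in range(order, len(frames) - order):
        for d in range(len(frames[t])):
            past_outputs = sum(out[k][d] for k in range(t - order, t))
            ahead_inputs = sum(frames[k][d] for k in range(t, t + order + 1))
            out[t][d] = (past_outputs + ahead_inputs) / (2 * order + 1)
    return out


def mva(frames, order=2):
    """MVA post-processing: CMVN followed by ARMA smoothing."""
    return arma_smooth(cmvn(frames), order=order)
```

In a pipeline like the one the abstract outlines, `mva` would be applied to each utterance's feature matrix before the features are passed to the HMM back-end. The sketch uses only the standard library so that it runs without dependencies.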

