Improved feature vectors using N-to-1 Gaussian MFCC transformation for automatic speech recognition system

2016 5th International Conference on Multimedia Computing and Systems (ICMCS). Pub Date: 2016-09-01. DOI: 10.1109/ICMCS.2016.7905523

O. Lachhab, El Hassan Ibn El Haj



Abstract

In this paper, we propose a novel vector transformation that projects the feature vectors into a new space with good discriminant properties while drastically reducing the number of parameters used in ASR systems. We call this method the “N-to-1 Gaussian MFCC transformation”. It uses the HMM acoustic parameters obtained with N Gaussians and with 1 Gaussian per state during training to compute the transformed vectors in the new projection space. The transformation allows a substantial reduction in the number of Gaussians (in the GMM modeling of each state's emission probability) while improving ASR performance. Experimental results on both the TIMIT and FPSD corpora show that the proposed feature transformation improves phone recognition accuracy compared with classical methods using conventional cepstral feature vectors, when the HMMs use fewer than 16 Gaussians per state.
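To make the parameter-reduction claim concrete, the sketch below counts the free parameters of GMM emission densities as a function of the number of Gaussians per state, and evaluates a state's emission log-likelihood. It is not the authors' N-to-1 transformation, which the abstract does not specify. Diagonal covariances, 39-dimensional MFCC vectors, and 48 three-state phone models are assumptions chosen for illustration only.

```python
import math

def gmm_param_count(num_states, num_gaussians, dim):
    """Free parameters of diagonal-covariance GMM emissions over all HMM states."""
    per_gaussian = 2 * dim          # one mean vector plus one diagonal variance vector
    free_weights = num_gaussians - 1  # mixture weights sum to 1
    return num_states * (num_gaussians * per_gaussian + free_weights)

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x | state) for a diagonal-covariance GMM, computed with log-sum-exp."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(ll)
    top = max(component_logs)
    return top + math.log(sum(math.exp(l - top) for l in component_logs))

# Hypothetical setup (not taken from the paper): 48 phones x 3 states, 39-dim MFCCs.
NUM_STATES = 48 * 3
DIM = 39

if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n:2d} Gaussians/state -> {gmm_param_count(NUM_STATES, n, DIM)} parameters")
```

Under these assumptions the parameter count grows almost linearly with the number of Gaussians per state, which is why a feature space that performs well with few Gaussians directly shrinks the acoustic model.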
