Improved feature vectors using N-to-1 Gaussian MFCC transformation for automatic speech recognition system
Authors: O. Lachhab, El Hassan Ibn El Haj
Published in: 2016 5th International Conference on Multimedia Computing and Systems (ICMCS), 2016-09-01
DOI: 10.1109/ICMCS.2016.7905523 (https://doi.org/10.1109/ICMCS.2016.7905523)
In this paper, we propose a novel vector transformation that projects the feature vectors into a new space with good discriminant properties while drastically reducing the number of parameters used in ASR systems. We call this method “N-to-1 Gaussian MFCC transformation”. It uses the HMM acoustic parameters obtained during training with N Gaussians and with 1 Gaussian per state to calculate the transformed vectors in the new projection space. Our transformation technique permits a substantial reduction in the number of Gaussians (in the GMM modeling of the emission probability of each state) while improving the performance of ASR systems. Our experimental results on both the TIMIT and FPSD corpora demonstrate that the proposed feature transformation improves phone recognition accuracy compared with classical methods that use conventional cepstral feature vectors, when the HMMs use fewer than 16 Gaussians per state.
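To make the parameter-count argument concrete, the sketch below counts the stored parameters of one state's GMM and evaluates its emission log-likelihood. It is not the N-to-1 transformation itself, which the abstract does not specify. It assumes diagonal covariances and 39-dimensional MFCC vectors (a common configuration, not stated in the abstract), and the function names are hypothetical.

```python
import math


def gmm_param_count(num_gaussians, dim):
    """Stored parameters of one diagonal-covariance GMM.

    Each component holds `dim` means, `dim` variances and one weight.
    """
    return num_gaussians * (2 * dim + 1)


def diag_gmm_log_likelihood(x, weights, means, variances):
    """Emission log-likelihood log p(x) under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the Gaussian normalisation constant for this component.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        # Log of the exponential term (scaled squared distance to the mean).
        log_exp = -0.5 * sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
        log_terms.append(math.log(w) + log_norm + log_exp)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


# Assumed feature dimension: 39 (13 MFCCs plus delta and delta-delta).
DIM = 39
print(gmm_param_count(16, DIM))  # 1264 parameters per state
print(gmm_param_count(4, DIM))   # 316 parameters per state
print(gmm_param_count(1, DIM))   # 79 parameters per state
```

Under these assumptions, going from 16 Gaussians to 1 Gaussian per state cuts the per-state emission model from 1264 to 79 stored parameters, which is the kind of reduction the abstract refers to.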