Improvement of MFCC feature extraction accuracy using PCA in Indonesian speech recognition

2018 International Conference on Information and Communications Technology (ICOIACT). Pub Date: 2018-03-06. DOI: 10.1109/ICOIACT.2018.8350748

A. Winursito, Risanuri Hidayat, Agus Bejo

Pages: 379-383

Citations: 63

Abstract




Pattern recognition systems use many methods. In speech recognition, Mel Frequency Cepstral Coefficients (MFCC) are a popular feature extraction method, but they have several weaknesses, particularly in accuracy and in the high dimensionality of the resulting features. This paper combines MFCC feature extraction with Principal Component Analysis (PCA) to improve accuracy in an Indonesian speech recognition system. Combining MFCC and PCA was expected to increase system accuracy and reduce the feature dimension. The extracted MFCC features, together with delta coefficients, form a data matrix that is then reduced using PCA. Two versions of the PCA reduction were designed. The PCA-reduced data were then classified using the K-Nearest Neighbour (KNN) method. The data set consisted of 140 speech recordings from 28 speakers. The results showed that PCA version 1 reduced the feature dimension from 26 to 12 while matching the accuracy of the conventional MFCC method without PCA, 86.43%. PCA version 2 increased speech recognition accuracy over the conventional MFCC method without PCA from 86.43% to 89.29% and reduced the feature dimension from 26 to 10.
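
The pipeline described in the abstract (26-dimensional features, PCA reduction, KNN classification) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the two PCA versions differ, so plain PCA via SVD is used here. The feature matrix is synthetic placeholder data rather than real MFCC output, and the split of 26 into 13 MFCCs plus 13 deltas, the five word classes, the 100/40 train/test split, and `k=3` are all assumptions made only for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes of X (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal axes learned by pca_fit."""
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Classify each test row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic placeholder data, NOT the paper's recordings: 140 utterances with
# 26 features each (assumed to stand in for 13 MFCCs + 13 deltas) and
# 5 hypothetical word classes.
rng = np.random.default_rng(0)
n_utterances, n_features, n_classes = 140, 26, 5
labels = np.arange(n_utterances) % n_classes
class_means = rng.normal(scale=3.0, size=(n_classes, n_features))
features = class_means[labels] + rng.normal(size=(n_utterances, n_features))

# Hypothetical 100/40 train/test split; the abstract does not state the protocol.
order = rng.permutation(n_utterances)
train_idx, test_idx = order[:100], order[100:]

# Fit PCA on the training rows only, then reduce 26 -> 12 dimensions
# (the target dimension reported for PCA version 1).
mean, components = pca_fit(features[train_idx], n_components=12)
reduced_train = pca_transform(features[train_idx], mean, components)
reduced_test = pca_transform(features[test_idx], mean, components)

predictions = knn_predict(reduced_train, labels[train_idx], reduced_test, k=3)
accuracy = float((predictions == labels[test_idx]).mean())
print(f"reduced shape: {reduced_train.shape}, accuracy on synthetic data: {accuracy:.2%}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the paper's results. With real recordings, the `features` matrix would be replaced by per-utterance MFCC and delta vectors, and changing `n_components` to 10 gives the dimension reported for PCA version 2.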
