Improvement of MFCC feature extraction accuracy using PCA in Indonesian speech recognition
A. Winursito, Risanuri Hidayat, Agus Bejo
2018 International Conference on Information and Communications Technology (ICOIACT), pp. 379-383, published 2018-03-06
DOI: 10.1109/ICOIACT.2018.8350748 (https://doi.org/10.1109/ICOIACT.2018.8350748)
Pattern recognition systems draw on many methods. In speech recognition, Mel Frequency Cepstral Coefficients (MFCC) are a popular feature extraction method, but they have weaknesses, notably limited accuracy and the high dimensionality of the resulting features. This paper combines MFCC feature extraction with Principal Component Analysis (PCA) to improve accuracy in an Indonesian speech recognition system. The combination was expected both to increase system accuracy and to reduce the feature dimension. The MFCC features, augmented with delta coefficients, form a data matrix that is then reduced using PCA. Two versions of the PCA reduction step were designed. The reduced data are then classified using the K-Nearest Neighbour (KNN) method. The dataset consists of 140 speech recordings from 28 speakers. The results show that PCA version 1 reduced the feature dimension from 26 to 12 while matching the accuracy of the conventional MFCC method without PCA, 86.43%. PCA version 2 increased recognition accuracy over the conventional MFCC method from 86.43% to 89.29% while reducing the feature dimension from 26 to 10.
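The pipeline described in the abstract (26-dimensional features, PCA reduction, KNN classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two PCA versions differ, so neither is reproduced here. The features are random stand-ins rather than real MFCC and delta coefficients, and the class layout (5 classes of 28 samples), the 100/40 train/test split, and k = 3 are all hypothetical choices. The code uses only the Python standard library, with PCA approximated by power iteration and deflation; a numerical library would be the more usual choice in practice.

```python
import math
import random
from collections import Counter


def make_synthetic_features(n_classes=5, per_class=28, dim=26, seed=1):
    """Random stand-ins for 26-dimensional MFCC + delta vectors (not real speech)."""
    rng = random.Random(seed)
    feats, labels = [], []
    for label in range(n_classes):
        centre = [rng.uniform(-3, 3) for _ in range(dim)]
        for _ in range(per_class):
            feats.append([c + rng.gauss(0, 1) for c in centre])
            labels.append(label)
    return feats, labels


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows, means):
    """Sample covariance matrix of the rows about the given means."""
    centred = [[x - m for x, m in zip(r, means)] for r in rows]
    cols = list(zip(*centred))
    n = len(rows)
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]


def principal_axes(cov, k, iters=300, seed=0):
    """Approximate top-k eigenvectors of cov: power iteration with deflation."""
    cov = [row[:] for row in cov]  # copy, because deflation modifies the matrix
    rng = random.Random(seed)
    dim = len(cov)
    axes = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(dim)]
        for _ in range(iters):
            w = [sum(c * x for c, x in zip(row, v)) for row in cov]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        cv = [sum(c * x for c, x in zip(row, v)) for row in cov]
        eigval = sum(a * b for a, b in zip(v, cv))  # Rayleigh quotient
        for i in range(dim):
            for j in range(dim):
                cov[i][j] -= eigval * v[i] * v[j]  # remove the direction just found
        axes.append(v)
    return axes


def project(rows, means, axes):
    """Centre each row and express it in the reduced PCA coordinates."""
    return [[sum((x - m) * a for x, m, a in zip(r, means, axis)) for axis in axes]
            for r in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def run_demo(n_components=12, k=3, n_train=100):
    """Reduce 26 features to n_components, then classify held-out samples with KNN."""
    feats, labels = make_synthetic_features()
    order = list(range(len(feats)))
    random.Random(2).shuffle(order)
    train, test = order[:n_train], order[n_train:]  # hypothetical split
    train_x = [feats[i] for i in train]
    train_y = [labels[i] for i in train]
    means = column_means(train_x)  # PCA is fitted on the training data only
    axes = principal_axes(covariance(train_x, means), n_components)
    train_z = project(train_x, means, axes)
    test_z = project([feats[i] for i in test], means, axes)
    hits = sum(knn_predict(train_z, train_y, z, k) == labels[i]
               for z, i in zip(test_z, test))
    return len(train_z[0]), hits / len(test)


if __name__ == "__main__":
    reduced_dim, accuracy = run_demo()
    print(f"reduced dimension: {reduced_dim}, accuracy on synthetic data: {accuracy:.2%}")
```

Calling `run_demo(n_components=10)` mirrors the 26-to-10 reduction reported for PCA version 2. Because the data are synthetic, the printed accuracy says nothing about the paper's 86.43% and 89.29% figures. The power iteration runs a fixed number of steps, so the axes are approximations of the true principal components.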