Human Emotion Detection with Speech Recognition Using Mel-frequency Cepstral Coefficient and Support Vector Machine
Raufani Aminullah A., Muhammad Nasrun, C. Setianingsih
2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)
Published: 2021-04-28
DOI: 10.1109/AIMS52415.2021.9466077
Citations: 2
Abstract
In the era of globalization, emotion recognition has become a research topic in specific fields, especially in human-computer interaction. Often, we recognize someone's emotions only through facial expressions; another approach is to recognize emotions from speech signals. In this study, a human emotion detection system based on speech signals was built using the Mel-Frequency Cepstral Coefficient (MFCC) as the feature extraction method. MFCC was chosen because it approximates the response of the human auditory system more closely than other feature representations. The Support Vector Machine (SVM), a supervised data classification method developed by Vapnik and Chervonenkis and widely adopted in the 1990s, was used as the classifier; it is frequently applied to speech classification tasks. In several previous studies, the kernel most commonly used with multi-class SVM was the Radial Basis Function (RBF) kernel, because SVM with the RBF kernel tends to achieve better accuracy. The highest accuracy obtained in this study was 72.5%, with a frame size of 0.001 seconds, 80 filter banks, gamma in the range [0.3, 0.7], and C = 1.0.
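To make the described pipeline concrete, the sketch below pairs MFCC feature extraction (librosa) with a multi-class RBF-kernel SVM (scikit-learn). It is not the authors' implementation: the synthetic signals, sample rate, frame step, and number of cepstral coefficients are assumptions, while the 80 filter banks, C = 1.0, and gamma within the reported [0.3, 0.7] range mirror the abstract.

```python
# Minimal sketch of an MFCC + RBF-kernel SVM emotion classifier, assuming librosa
# and scikit-learn. Synthetic clips stand in for a labeled emotional-speech corpus.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

SR = 16000  # sample rate (assumed; not stated in the abstract)

def mfcc_features(signal, sr=SR):
    """Return a fixed-length feature vector: per-coefficient MFCC mean and std."""
    mfcc = librosa.feature.mfcc(
        y=signal, sr=sr,
        n_mfcc=13,                   # number of cepstral coefficients (assumed)
        n_fft=int(0.025 * sr),       # frame size (assumed; the paper reports 0.001 s)
        hop_length=int(0.010 * sr),  # frame step (assumed)
        n_mels=80,                   # 80 mel filter banks, per the abstract
    )
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Stand-in corpus: random 1-second clips with emotion labels. A real experiment
# would load labeled speech recordings here instead.
rng = np.random.default_rng(0)
emotions = ["angry", "happy", "sad", "neutral"]
signals = [rng.standard_normal(SR).astype(np.float32) for _ in range(40)]
labels = [emotions[i % len(emotions)] for i in range(40)]

X = np.vstack([mfcc_features(s) for s in signals])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0)

# Multi-class SVM with the RBF kernel; C = 1.0 follows the abstract, and
# gamma = 0.5 is one value from the reported [0.3, 0.7] range.
clf = SVC(kernel="rbf", C=1.0, gamma=0.5)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

In practice the random clips would be replaced by a labeled emotional-speech dataset, and C and gamma would be tuned, for example by cross-validating over the gamma range reported in the abstract.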