{"title":"Feature Selection for GUMI Kernel-Based SVM in Speech Emotion Recognition","authors":"I. Trabelsi, M. Bouhlel","doi":"10.4018/IJSE.2015070104","DOIUrl":null,"url":null,"abstract":"Speech emotion recognition is the indispensable requirement for efficient human machine interaction. Most modern automatic speech emotion recognition systems use Gaussian mixture models GMM and Support Vector Machines SVM. GMM are known for their performance and scalability in the spectral modeling while SVM are known for their discriminatory power. A GMM-supervector characterizes an emotional style by the GMM parameters mean vectors, covariance matrices, and mixture weights. GMM-supervector SVM benefits from both GMM and SVM frameworks. In this paper, the GMM-UBM mean interval GUMI kernel based on the Bhattacharyya distance is successfully used. CFSSubsetEval combined with Best first algorithm and Greedy stepwise were also utilized on the supervectors space in order to select the most important features. This framework is illustrated using Mel-frequency cepstral MFCC coefficients and Perceptual Linear Prediction PLP features on two different emotional databases namely the Surrey Audio-Expressed Emotion and the Berlin Emotional speech Database.","PeriodicalId":272943,"journal":{"name":"Int. J. Synth. Emot.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Synth. Emot.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4018/IJSE.2015070104","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 17
Abstract
Speech emotion recognition is an indispensable requirement for efficient human-machine interaction. Most modern automatic speech emotion recognition systems use Gaussian mixture models (GMM) and support vector machines (SVM). GMMs are known for their performance and scalability in spectral modeling, while SVMs are known for their discriminative power. A GMM supervector characterizes an emotional style by the GMM parameters: mean vectors, covariance matrices, and mixture weights. The GMM-supervector SVM benefits from both the GMM and SVM frameworks. In this paper, the GMM-UBM mean interval (GUMI) kernel, which is based on the Bhattacharyya distance, is successfully used. CFSSubsetEval, combined with the best-first and greedy stepwise search algorithms, was also applied in the supervector space to select the most important features. This framework is illustrated using Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP) features on two different emotional databases, namely the Surrey Audio-Expressed Emotion database and the Berlin Emotional Speech Database.
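The GUMI supervector can be made concrete with a small sketch. It assumes the commonly cited form of the kernel with diagonal covariances, which the abstract does not spell out: each adapted mean offset from the UBM is scaled by the inverse square root of the average of the adapted and UBM variances, and the kernel is the inner product of the resulting supervectors. The function names and toy values are illustrative and not taken from the paper, and mixture-weight terms are omitted.

```python
import math


def gumi_supervector(means, variances, ubm_means, ubm_variances):
    """Build a GUMI-style supervector (assumed form, diagonal covariances).

    Each argument is a list with one entry per mixture component; each
    entry is a list of per-dimension values.
    """
    supervector = []
    for mu, var, ubm_mu, ubm_var in zip(means, variances, ubm_means, ubm_variances):
        for m, v, um, uv in zip(mu, var, ubm_mu, ubm_var):
            # Diagonal case of ((Sigma + Sigma_ubm) / 2)^(-1/2) (mu - mu_ubm).
            supervector.append((m - um) / math.sqrt((v + uv) / 2.0))
    return supervector


def gumi_kernel(sv_a, sv_b):
    """Linear kernel (inner product) between two GUMI-style supervectors."""
    return sum(a * b for a, b in zip(sv_a, sv_b))


# Toy UBM with two components in two dimensions (made-up values).
ubm_means = [[0.0, 0.0], [1.0, 1.0]]
ubm_variances = [[1.0, 1.0], [1.0, 1.0]]

# Toy utterance-adapted GMM (made-up values).
utt_means = [[1.0, 0.0], [1.0, 3.0]]
utt_variances = [[1.0, 1.0], [1.0, 3.0]]

sv = gumi_supervector(utt_means, utt_variances, ubm_means, ubm_variances)
print(len(sv), gumi_kernel(sv, sv))
```

Because the kernel reduces to an inner product, the supervectors can be passed directly to a linear SVM, and feature selection can operate on individual supervector dimensions.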