Deep emotion recognition using prosodic and spectral feature extraction and classification based on cross validation and bootstrap
Authors: Ayush Sharma, David V. Anderson
Published in: 2015 IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE), pp. 421-425
Publication date: 2015-08-01
DOI: 10.1109/DSP-SPE.2015.7369591 (https://doi.org/10.1109/DSP-SPE.2015.7369591)
Citations: 5
Abstract
Despite the existence of a robust model to identify basic emotions, the ability to reliably classify a large group of emotions is yet to be developed. Hence, the objective of this paper is to develop an efficient technique to identify emotions with an accuracy comparable to that of humans. The array of emotions addressed in this paper goes far beyond those present on the circumplex diagram. Because emotions are correlated and ambiguous, both prosodic and spectral features of speech are considered during feature extraction. Feature selection algorithms are applied so that classification works on a subset of relevant features. Owing to the low dimensionality of the resulting feature space, several cross-validation methods are employed in combination with different classifiers, and their performances are compared. In addition to cross-validation, the bootstrap error estimate is calculated, and a combination of both is used as an overall estimate of the classification accuracy of the model.
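The abstract does not name the classifiers, the number of folds, or the rule used to combine the two error estimates, so the following is only a minimal sketch under stated assumptions. It uses synthetic 2-D points in place of selected prosodic and spectral features, a nearest-centroid classifier as a stand-in for the paper's classifiers, 5-fold cross-validation, an out-of-bag bootstrap estimate, and a plain average of the two as the combined figure. All function names and parameters here are hypothetical and chosen for illustration.

```python
import random
from statistics import mean

def make_data(rng, n_per_class=40):
    """Synthetic 2-D stand-in for selected speech features (not real data)."""
    centers = {0: (0.0, 0.0), 1: (3.0, 3.0), 2: (0.0, 4.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(data)
    return data

def fit_centroids(train):
    """Nearest-centroid model: one mean vector per class (assumed classifier)."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def error_rate(train, test):
    """Fit on train, return the fraction of misclassified test samples."""
    centroids = fit_centroids(train)
    wrong = sum(predict(centroids, x) != y for x, y in test)
    return wrong / len(test)

def kfold_error(data, k=5):
    """Mean held-out error over k folds (data is assumed already shuffled)."""
    folds = [data[i::k] for i in range(k)]
    errs = []
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        errs.append(error_rate(train, folds[i]))
    return mean(errs)

def bootstrap_error(data, rng, rounds=50):
    """Mean out-of-bag error: train on a resample, test on the samples left out."""
    n = len(data)
    errs = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [data[i] for i in range(n) if i not in chosen]
        if oob:  # skip the (very unlikely) round with no left-out samples
            errs.append(error_rate([data[i] for i in idx], oob))
    return mean(errs)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(rng)
cv_err = kfold_error(data)
boot_err = bootstrap_error(data, rng)
combined = 0.5 * (cv_err + boot_err)  # assumption: simple average of the two
print(f"CV error: {cv_err:.3f}  bootstrap error: {boot_err:.3f}  combined: {combined:.3f}")
```

The out-of-bag bootstrap estimate tends to be pessimistic because each resample contains fewer distinct training points than the full set, which is one reason to combine it with a cross-validation figure. The equal weighting above is an illustrative choice, not the paper's stated method.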