Speech Emotion Classification Using Machine Learning Algorithms
S. Casale, A. Russo, G. Scebba, S. Serrano
2008 IEEE International Conference on Semantic Computing. Published 2008-08-04.
DOI: 10.1109/ICSC.2008.43 (https://doi.org/10.1109/ICSC.2008.43)
Citations: 84
Abstract
Recognizing emotional states from speech is a relatively new task in machine learning. This paper presents the design and performance results of an emotion classification system built on the architecture of a distributed speech recognition (DSR) system. The features were extracted by the ETSI Aurora eXtended front end of a mobile terminal, in compliance with the ETSI ES 202-211 V1.1.1 standard. From the time trends of these features, more than 3800 statistical parameters were extracted to characterize semantic units of varying length (sentences and words). The WEKA (Waikato Environment for Knowledge Analysis) software was used to select the most significant parameters for classifying emotional states and to analyse the results of various classification techniques. Results obtained on both the Berlin Database of Emotional Speech (EMO-DB) and the Speech Under Simulated and Actual Stress (SUSAS) corpus showed that the best performance was achieved by a support vector machine (SVM) trained with the sequential minimal optimization (SMO) algorithm, after normalizing and discretizing the input statistical parameters.
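The preprocessing chain described in the abstract (statistics computed over feature time trends, then normalization and discretization) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which statistics were computed or which discretization scheme was used, so the five statistics, the min-max scaling to [0, 1], the equal-width binning, and all function names and sample values below are assumptions chosen for illustration. The classification step (an SVM trained with SMO in WEKA) is not reproduced here.

```python
import statistics
from typing import Dict, List, Sequence


def contour_statistics(contour: Sequence[float]) -> Dict[str, float]:
    """Summarize one per-frame feature contour with a fixed set of statistics.

    A unit (word or sentence) of any length maps to the same number of values.
    The choice of statistics is an assumption, not taken from the paper.
    """
    return {
        "mean": statistics.fmean(contour),
        "std": statistics.pstdev(contour),
        "min": min(contour),
        "max": max(contour),
        "range": max(contour) - min(contour),
    }


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Rescale one statistical parameter across all units to [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def equal_width_bins(column: Sequence[float], n_bins: int = 10) -> List[int]:
    """Map values already in [0, 1] to bin indices 0..n_bins-1.

    Equal-width binning is an assumed scheme for illustration only.
    """
    return [min(int(v * n_bins), n_bins - 1) for v in column]


# Hypothetical per-frame contours for three units of different length.
units = [
    [0.2, 0.4, 0.9, 0.7],
    [0.1, 0.1, 0.2],
    [0.5, 0.8, 1.0, 0.9, 0.6],
]

stats = [contour_statistics(u) for u in units]
max_column = [s["max"] for s in stats]  # one parameter across all units
normalized = min_max_normalize(max_column)
discretized = equal_width_bins(normalized, n_bins=4)
print(discretized)  # prints [3, 0, 3]
```

Each unit yields the same number of parameters regardless of its duration, which is what allows words and sentences of varying length to be passed to a fixed-input classifier.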