A New Learning Scheme of Emotion Recognition From Speech by Using Mean Fourier Parameters
Xingyu Chen, Li-Jiao Wu, Aihua Mao, Zhi-hui Zhan
2019 Eleventh International Conference on Advanced Computational Intelligence (ICACI), published 2019-06-07
DOI: 10.1109/ICACI.2019.8778548 (https://doi.org/10.1109/ICACI.2019.8778548)
Research on emotional speech signals for human-machine interfaces has grown in recent years as high-performance computing has become widely available. Recognition accuracy depends heavily on the features extracted from the audio data, so feature extraction plays a central role in speech emotion recognition. Open problems remain, however, such as the heavy computational burden caused by high-dimensional data. In this paper, we propose a new learning scheme for speaker-independent speech emotion recognition that uses mean Fourier parameters to capture the perceptual content of voice quality. The scheme greatly reduces the dimension of the acoustic feature and substantially improves computational performance. Experiments use two speech databases (the German emotional corpus and Interactive Emotional Dyadic Motion Capture), and different features are combined with different classifiers for performance comparison. The recognition results show that the proposed scheme, which pairs mean Fourier parameters with a Random Forest classifier, classifies the emotional states in speech signals efficiently and outperforms the other features and classifiers tested.
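The abstract does not define the Fourier parameters precisely, so the following is only a rough sketch of how averaging over frames might shrink the feature dimension. It assumes, purely for illustration, that the parameters are the magnitudes of the first few DFT coefficients of each frame; the function names, frame length, hop size and number of coefficients are all hypothetical and are not taken from the paper.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D sequence into overlapping frames of equal length."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def dft_magnitudes(frame, n_coeffs):
    """Normalized magnitudes of DFT bins 1..n_coeffs of one frame (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(1, n_coeffs + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc) / n)
    return mags


def mean_fourier_features(signal, frame_len=64, hop=32, n_coeffs=8):
    """Average the per-frame magnitudes over all frames of one utterance.

    Assumption: "Fourier parameters" are taken here to be DFT magnitudes;
    the paper's actual definition may differ.
    """
    frames = frame_signal(signal, frame_len, hop)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    per_frame = [dft_magnitudes(f, n_coeffs) for f in frames]
    # zip(*per_frame) groups the values of each coefficient across frames.
    return [sum(col) / len(per_frame) for col in zip(*per_frame)]


# Toy "utterance": a pure tone with exactly 4 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(640)]
features = mean_fourier_features(tone)
```

In this sketch the utterance always collapses to a single vector of `n_coeffs` values, however many frames it contains, which is the kind of dimension reduction the abstract describes. Such a vector would then be passed to a classifier, for example a Random Forest as in the paper; that step is left out here.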