{"title":"基于连续密度HMM的情绪识别","authors":"R. Anila, A. Revathy","doi":"10.1109/ICCSP.2015.7322630","DOIUrl":null,"url":null,"abstract":"This paper proposes a technique to recognize the emotion present in the speech signal using Continuous Density HMM.The Perceptual Linear Predictive Cepstrum (PLPC) and Mel Frequency Perceptual Linear Predictive Cepstrum (MFPLPC) features are considered in our work and they are extracted from the speech and training models are created using Continuous Density HMM. For the Speaker Independent classification technique, pre- processing is done on test speeches and features are extracted. The log likelihood values are computed for all the models and the maximum value corresponds to the classification of particular emotion. The better recognition rate for emotions is obtained when the correct classification is counted for either one of the two features such as PLPC and MFPLPC. The emotions such as anger, fear and happy are grouped together as hard emotions and the emotions such as sad, boredom, disgust and neutral are grouped together as soft emotions. To classify a test speech corresponding to either hard emotion or soft emotion, the short-time energy value is computed for each emotional speech. The threshold value has been set to do this group classification correctly. All the test speeches are correctly classified as either hard or soft emotion. The sub-classification within a group is also done for a test speech. Accuracy is comparatively better for group models than that of the individual models. 
The results are obtained for the well-known and freely available database BERLIN using data of seven emotional states.","PeriodicalId":174192,"journal":{"name":"2015 International Conference on Communications and Signal Processing (ICCSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Emotion recognition using continuous density HMM\",\"authors\":\"R. Anila, A. Revathy\",\"doi\":\"10.1109/ICCSP.2015.7322630\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes a technique to recognize the emotion present in the speech signal using Continuous Density HMM.The Perceptual Linear Predictive Cepstrum (PLPC) and Mel Frequency Perceptual Linear Predictive Cepstrum (MFPLPC) features are considered in our work and they are extracted from the speech and training models are created using Continuous Density HMM. For the Speaker Independent classification technique, pre- processing is done on test speeches and features are extracted. The log likelihood values are computed for all the models and the maximum value corresponds to the classification of particular emotion. The better recognition rate for emotions is obtained when the correct classification is counted for either one of the two features such as PLPC and MFPLPC. The emotions such as anger, fear and happy are grouped together as hard emotions and the emotions such as sad, boredom, disgust and neutral are grouped together as soft emotions. To classify a test speech corresponding to either hard emotion or soft emotion, the short-time energy value is computed for each emotional speech. The threshold value has been set to do this group classification correctly. All the test speeches are correctly classified as either hard or soft emotion. The sub-classification within a group is also done for a test speech. 
Accuracy is comparatively better for group models than that of the individual models. The results are obtained for the well-known and freely available database BERLIN using data of seven emotional states.\",\"PeriodicalId\":174192,\"journal\":{\"name\":\"2015 International Conference on Communications and Signal Processing (ICCSP)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-04-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 International Conference on Communications and Signal Processing (ICCSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCSP.2015.7322630\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Conference on Communications and Signal Processing (ICCSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCSP.2015.7322630","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This paper proposes a technique for recognizing the emotion present in a speech signal using Continuous Density HMMs. Perceptual Linear Predictive Cepstrum (PLPC) and Mel Frequency Perceptual Linear Predictive Cepstrum (MFPLPC) features are extracted from the speech, and training models are created with Continuous Density HMMs. In the speaker-independent classification scheme, test utterances are pre-processed and features are extracted. Log-likelihood values are computed against all models, and the model with the maximum value determines the recognized emotion. A better recognition rate is obtained when a classification is counted as correct if either of the two features, PLPC or MFPLPC, yields the right emotion. Anger, fear, and happiness are grouped together as hard emotions; sadness, boredom, disgust, and neutral are grouped together as soft emotions. To assign a test utterance to either the hard or the soft group, its short-time energy is computed and compared against a threshold chosen so that this group classification is performed correctly; all test utterances are correctly classified as hard or soft. Sub-classification within a group is then performed for each test utterance. Accuracy is comparatively better for the group models than for the individual models. Results are reported on the well-known, freely available BERLIN database, using data for seven emotional states.
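The maximum log-likelihood decision rule described in the abstract can be sketched as follows. This is not the paper's implementation: for brevity it replaces each Continuous Density HMM with a single diagonal-Gaussian model per emotion (the argmax-over-log-likelihoods logic is the same), and the cepstral frames are random placeholders standing in for PLPC/MFPLPC features.

```python
import numpy as np

rng = np.random.default_rng(0)
EMOTIONS = ["anger", "fear", "happiness", "sadness", "boredom", "disgust", "neutral"]

def fit_gaussian(frames):
    # Mean and diagonal variance of the training frames (one "model" per emotion).
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def log_likelihood(frames, model):
    # Sum of per-frame diagonal-Gaussian log densities.
    mu, var = model
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (frames - mu) ** 2 / var))

# Placeholder training data: 200 frames of 13-dim "cepstra" per emotion,
# each emotion centred at a different mean so the classes are separable.
models = {e: fit_gaussian(rng.normal(loc=i, size=(200, 13)))
          for i, e in enumerate(EMOTIONS)}

def classify(frames):
    # Score the test utterance against every model; the maximum
    # log-likelihood determines the recognized emotion.
    scores = {e: log_likelihood(frames, m) for e, m in models.items()}
    return max(scores, key=scores.get)

print(classify(np.full((80, 13), 1.0)))  # prints "fear" (the class centred at 1)
```

A real CD-HMM adds state transitions and per-state emission densities on top of this scoring step, but the classification decision is the same argmax over per-model log-likelihoods.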
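The hard/soft grouping step can likewise be sketched with a short-time-energy threshold. The frame length, hop size, and threshold value below are hypothetical; the paper tunes its threshold on the emotional speech data rather than using a fixed constant.

```python
import numpy as np

HARD = {"anger", "fear", "happiness"}            # high-energy emotions
SOFT = {"sadness", "boredom", "disgust", "neutral"}

def short_time_energy(signal, frame_len=400, hop=160):
    """Mean per-frame energy: sum of squared samples within each frame."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return float(np.mean([np.sum(f ** 2) for f in frames]))

ENERGY_THRESHOLD = 5.0   # hypothetical value, set so grouping is correct

def group(signal):
    # Utterances above the threshold go to the hard-emotion group.
    return "hard" if short_time_energy(signal) > ENERGY_THRESHOLD else "soft"

# One second of a 200 Hz tone at 16 kHz: a loud and a much quieter version.
loud = 0.5 * np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
quiet = 0.02 * loud
print(group(loud), group(quiet))  # prints "hard soft"
```

After this binary grouping, sub-classification within the chosen group proceeds with the per-emotion models, which is why the group models can outperform the individual models.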