Emotion recognition using continuous density HMM

R. Anila, A. Revathy
DOI: 10.1109/ICCSP.2015.7322630
Published in: 2015 International Conference on Communications and Signal Processing (ICCSP)
Publication date: 2015-04-02
Citations: 6

Abstract

This paper proposes a technique to recognize the emotion present in a speech signal using continuous density HMMs. Perceptual Linear Predictive Cepstrum (PLPC) and Mel Frequency Perceptual Linear Predictive Cepstrum (MFPLPC) features are extracted from the speech, and training models are created using continuous density HMMs. For the speaker-independent classification technique, pre-processing is performed on the test speeches and features are extracted. Log-likelihood values are computed against all the models, and the maximum value determines the classified emotion. A better recognition rate is obtained when a classification is counted as correct if either of the two features, PLPC or MFPLPC, yields the correct emotion. Anger, fear, and happiness are grouped together as hard emotions, while sadness, boredom, disgust, and neutral are grouped together as soft emotions. To classify a test speech as carrying either a hard or a soft emotion, the short-time energy value is computed for each emotional speech, and a threshold is set so that this group classification is performed correctly. All the test speeches are correctly classified as either hard or soft. Sub-classification within a group is also performed for each test speech. Accuracy is comparatively better for the group models than for the individual models. Results are obtained on the well-known, freely available BERLIN database using data for seven emotional states.
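The classification step described above scores a test utterance against every per-emotion model and picks the emotion with the maximum log likelihood. The following is a minimal sketch of that decision rule, not the paper's implementation: the actual models are continuous-density HMMs over PLPC/MFPLPC features, whereas here each emotion is stood in for by a single diagonal-covariance Gaussian, and all function names and values are illustrative.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from training feature frames."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(frames, model):
    """Total log likelihood of the frames under a diagonal Gaussian model."""
    mean, var = model
    ll = 0.0
    for f in frames:
        for d in range(len(mean)):
            ll += -0.5 * (math.log(2 * math.pi * var[d])
                          + (f[d] - mean[d]) ** 2 / var[d])
    return ll

def classify(frames, models):
    """Return the emotion label whose model gives the maximum log likelihood."""
    return max(models, key=lambda emo: log_likelihood(frames, models[emo]))
```

In the paper, the same argmax-over-log-likelihoods rule is applied with HMM forward scores instead of the single-Gaussian scores used in this sketch.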
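The hard/soft grouping rests on a simple short-time energy comparison: frame the utterance, sum squared samples per frame, and compare against a preset threshold. A sketch of that computation follows; the frame length and threshold value here are illustrative placeholders, since the paper tunes its threshold on the BERLIN data.

```python
def short_time_energy(samples, frame_len=160):
    """Mean per-frame energy, where a frame's energy is its sum of squared samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame))
    return sum(energies) / len(energies) if energies else 0.0

def hard_or_soft(samples, threshold=10.0):
    """High-energy utterances (anger, fear, happiness) map to the hard group;
    low-energy ones (sadness, boredom, disgust, neutral) to the soft group."""
    return "hard" if short_time_energy(samples) > threshold else "soft"
```

Once an utterance is assigned to a group, sub-classification within that group proceeds with the per-emotion models of that group only, which is why the group models outperform the individual models in the reported results.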