Classifying emotions in human-machine spoken dialogs

C. Lee, Shrikanth S. Narayanan, R. Pieraccini
DOI: 10.1109/ICME.2002.1035887
Published in: Proceedings. IEEE International Conference on Multimedia and Expo, pp. 737-740 vol. 1
Publication date: 2002-11-07
Citations: 75

Abstract

This paper reports a comparison of acoustic feature sets and classification algorithms for classifying spoken utterances according to the emotional state of the speaker. The data set used for the analysis comes from a corpus of human-machine dialogs obtained from a commercial application. Emotion recognition is posed as a pattern recognition problem. We used three different techniques - a linear discriminant classifier (LDC), a k-nearest neighbor (k-NN) classifier, and a support vector machine classifier (SVC) - to classify utterances into two emotion classes: negative and non-negative. Two feature sets were used: a base feature set obtained from utterance-level statistics of the pitch and energy of the speech, and a reduced feature set obtained by applying principal component analysis (PCA) to the base features. The PCA features performed comparably to the base feature set. Overall, the LDC achieved the best performance with the base feature set, with error rates of 27.54% on female data and 25.46% on male data. The SVC, however, was more robust to data sparsity.
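The comparison described in the abstract - LDC, k-NN, and SVC evaluated on a base feature set and on its PCA projection - could be sketched as follows. This is a minimal illustration using scikit-learn on synthetic stand-in features, not the paper's data, features, or implementation; the feature dimensionality, classifier hyperparameters, and cross-validation setup are all assumptions.

```python
# Sketch of the abstract's comparison: three classifiers evaluated on a
# "base" feature set and on a PCA-reduced version of the same features.
# Synthetic data stands in for utterance-level pitch/energy statistics.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
# Stand-in "base features" (e.g. mean/std/min/max of pitch and energy).
X = rng.normal(size=(n, 8))
# Binary labels (negative vs. non-negative) driven by two of the features.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

classifiers = {
    "LDC": LinearDiscriminantAnalysis(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVC": SVC(kernel="rbf"),
}

results = {}
for name, clf in classifiers.items():
    base = make_pipeline(StandardScaler(), clf)
    pca = make_pipeline(StandardScaler(), PCA(n_components=4), clf)
    # 5-fold cross-validated accuracy on each feature representation.
    acc_base = cross_val_score(base, X, y, cv=5).mean()
    acc_pca = cross_val_score(pca, X, y, cv=5).mean()
    results[name] = (acc_base, acc_pca)
    print(f"{name}: base={acc_base:.3f}  PCA={acc_pca:.3f}")
```

On this synthetic task the PCA pipeline tends to score close to the base features, mirroring the abstract's observation that PCA gave comparable performance with fewer dimensions.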