Action Unit Models of Facial Expression of Emotion in the Presence of Speech.

Miraj Shah, David G Cooper, Houwei Cao, Ruben C Gur, Ani Nenkova, Ragini Verma
{"title":"言语存在时面部情绪表达的动作单元模型。","authors":"Miraj Shah,&nbsp;David G Cooper,&nbsp;Houwei Cao,&nbsp;Ruben C Gur,&nbsp;Ani Nenkova,&nbsp;Ragini Verma","doi":"10.1109/ACII.2013.15","DOIUrl":null,"url":null,"abstract":"<p><p>Automatic recognition of emotion using facial expressions in the presence of speech poses a unique challenge because talking reveals clues for the affective state of the speaker but distorts the canonical expression of emotion on the face. We introduce a corpus of acted emotion expression where speech is either present (talking) or absent (silent). The corpus is uniquely suited for analysis of the interplay between the two conditions. We use a multimodal decision level fusion classifier to combine models of emotion from talking and silent faces as well as from audio to recognize five basic emotions: anger, disgust, fear, happy and sad. Our results strongly indicate that emotion prediction in the presence of speech from action unit facial features is less accurate when the person is talking. Modeling talking and silent expressions separately and fusing the two models greatly improves accuracy of prediction in the talking setting. The advantages are most pronounced when silent and talking face models are fused with predictions from audio features. In this multi-modal prediction both the combination of modalities and the separate models of talking and silent facial expression of emotion contribute to the improvement.</p>","PeriodicalId":89154,"journal":{"name":"International Conference on Affective Computing and Intelligent Interaction and workshops : [proceedings]. ACII (Conference)","volume":"2013 ","pages":"49-54"},"PeriodicalIF":0.0000,"publicationDate":"2013-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/ACII.2013.15","citationCount":"11","resultStr":"{\"title\":\"Action Unit Models of Facial Expression of Emotion in the Presence of Speech.\",\"authors\":\"Miraj Shah,&nbsp;David G Cooper,&nbsp;Houwei Cao,&nbsp;Ruben C Gur,&nbsp;Ani Nenkova,&nbsp;Ragini Verma\",\"doi\":\"10.1109/ACII.2013.15\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Automatic recognition of emotion using facial expressions in the presence of speech poses a unique challenge because talking reveals clues for the affective state of the speaker but distorts the canonical expression of emotion on the face. We introduce a corpus of acted emotion expression where speech is either present (talking) or absent (silent). The corpus is uniquely suited for analysis of the interplay between the two conditions. We use a multimodal decision level fusion classifier to combine models of emotion from talking and silent faces as well as from audio to recognize five basic emotions: anger, disgust, fear, happy and sad. Our results strongly indicate that emotion prediction in the presence of speech from action unit facial features is less accurate when the person is talking. Modeling talking and silent expressions separately and fusing the two models greatly improves accuracy of prediction in the talking setting. The advantages are most pronounced when silent and talking face models are fused with predictions from audio features. 
In this multi-modal prediction both the combination of modalities and the separate models of talking and silent facial expression of emotion contribute to the improvement.</p>\",\"PeriodicalId\":89154,\"journal\":{\"name\":\"International Conference on Affective Computing and Intelligent Interaction and workshops : [proceedings]. ACII (Conference)\",\"volume\":\"2013 \",\"pages\":\"49-54\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/ACII.2013.15\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Affective Computing and Intelligent Interaction and workshops : [proceedings]. ACII (Conference)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ACII.2013.15\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Affective Computing and Intelligent Interaction and workshops : [proceedings]. ACII (Conference)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACII.2013.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract


Automatic recognition of emotion using facial expressions in the presence of speech poses a unique challenge because talking reveals clues for the affective state of the speaker but distorts the canonical expression of emotion on the face. We introduce a corpus of acted emotion expression where speech is either present (talking) or absent (silent). The corpus is uniquely suited for analysis of the interplay between the two conditions. We use a multimodal decision level fusion classifier to combine models of emotion from talking and silent faces as well as from audio to recognize five basic emotions: anger, disgust, fear, happy and sad. Our results strongly indicate that emotion prediction in the presence of speech from action unit facial features is less accurate when the person is talking. Modeling talking and silent expressions separately and fusing the two models greatly improves accuracy of prediction in the talking setting. The advantages are most pronounced when silent and talking face models are fused with predictions from audio features. In this multi-modal prediction both the combination of modalities and the separate models of talking and silent facial expression of emotion contribute to the improvement.
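The decision-level fusion described above can be illustrated with a minimal sketch: train one classifier per modality (talking-face action units, silent-face action units, audio) and combine their posterior probabilities before taking the final decision. The weighted-average fusion rule, the feature dimensions, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of decision-level fusion for five-class emotion recognition.
# The per-modality classifiers, feature sizes, and the weighted-average fusion
# rule are illustrative assumptions; the abstract does not specify them.
import numpy as np
from sklearn.linear_model import LogisticRegression

EMOTIONS = ["anger", "disgust", "fear", "happy", "sad"]

def train_modality_model(features, labels):
    """Train one classifier for a single modality
    (e.g., talking-face AUs, silent-face AUs, or audio features)."""
    model = LogisticRegression(max_iter=1000)
    model.fit(features, labels)
    return model

def fuse_decisions(models, feature_sets, weights=None):
    """Combine per-modality posterior probabilities by weighted averaging
    (one plausible decision-level fusion rule; equal weights by default)."""
    posteriors = [m.predict_proba(x) for m, x in zip(models, feature_sets)]
    if weights is None:
        weights = np.ones(len(posteriors)) / len(posteriors)
    fused = sum(w * p for w, p in zip(weights, posteriors))
    return fused.argmax(axis=1)  # predicted emotion index per sample

# Usage example with random arrays standing in for real AU and audio features.
rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, len(EMOTIONS), n)
talking_au = rng.normal(size=(n, 17))   # hypothetical action-unit intensities while talking
silent_au = rng.normal(size=(n, 17))    # hypothetical action-unit intensities while silent
audio_feat = rng.normal(size=(n, 32))   # hypothetical prosodic/spectral descriptors

models = [train_modality_model(x, labels) for x in (talking_au, silent_au, audio_feat)]
predictions = fuse_decisions(models, [talking_au, silent_au, audio_feat])
print([EMOTIONS[i] for i in predictions[:5]])
```

In practice the modality weights would be tuned on held-out data; the key point of the sketch is that each modality (and each of the talking and silent face models) contributes its own posterior, and the fusion happens at the decision level rather than by concatenating features.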
