Audiovisual three-level fusion for continuous estimation of Russell's emotion circumplex

Enrique Sánchez-Lozano, Paula Lopez-Otero, Laura Docío Fernández, Enrique Argones-Rúa, J. Alba-Castro
Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge, published 2013-10-21
DOI: 10.1145/2512530.2512534 (https://doi.org/10.1145/2512530.2512534)
Citations: 27

Abstract

Emotion prediction is attracting attention across many research areas, which demand accurate predictions in uncontrolled scenarios. Despite this interest, existing emotion detection systems are still far from the desired accuracy. Human emotions are commonly described along two dimensions, valence and arousal, which together span Russell's circumplex, the space in which complex emotions lie. Accordingly, the Affect Recognition Sub-Challenge (ASC) of the third AudioVisual Emotion and Depression Challenge, AVEC'13, focuses on estimating these two dimensions. This paper presents a three-level fusion system that combines the individual regression results obtained from audio and visual features in order to maximize the average correlation over both dimensions. Five feature sets are extracted (three for audio and two for video) and merged through an iterative process. Results show that this fusion outperforms the baseline method on the challenge database.
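The abstract does not spell out the fusion rule or the regressors, so the following is only a minimal sketch of the general idea: combine the outputs of several single-feature-set regressors and score the result by the Pearson correlation averaged over valence and arousal. It uses a single least-squares fusion step rather than the paper's three-level iterative scheme, and every name in it (`pearson`, `fit_fusion_weights`, `fuse`, `mean_correlation`) as well as the synthetic data is a hypothetical illustration, not the authors' implementation.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))


def fit_fusion_weights(preds, target):
    """Least-squares weights (plus a bias term) for combining predictions.

    preds:  array of shape (n_frames, n_feature_sets), one column per regressor
    target: array of shape (n_frames,), the gold-standard labels
    """
    X = np.column_stack([preds, np.ones(len(preds))])
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w


def fuse(preds, w):
    """Apply fusion weights to per-feature-set predictions."""
    X = np.column_stack([preds, np.ones(len(preds))])
    return X @ w


def mean_correlation(pred_by_dim, gold_by_dim):
    """Average the Pearson correlation over all affect dimensions."""
    return float(np.mean([pearson(pred_by_dim[d], gold_by_dim[d])
                          for d in gold_by_dim]))


# Synthetic demonstration (assumed data, not the AVEC'13 corpus).
rng = np.random.default_rng(0)
n_frames = 500
gold = {"valence": rng.normal(size=n_frames),
        "arousal": rng.normal(size=n_frames)}

fused = {}
for dim, y in gold.items():
    # Five noisy stand-ins for the five feature sets (three audio, two video).
    preds = np.column_stack([y + rng.normal(scale=s, size=n_frames)
                             for s in (1.0, 1.5, 2.0, 1.2, 1.8)])
    w = fit_fusion_weights(preds, y)
    fused[dim] = fuse(preds, w)

print("mean correlation after fusion:", mean_correlation(fused, gold))
```

On the data used to fit the weights, a least-squares combination with a bias term correlates with the target at least as strongly as any single input column, which is why it serves as a simple stand-in here. A real evaluation would fit the weights on training data and report the correlation on held-out data.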