Group-Level Emotion Recognition Using Hybrid Deep Models Based on Faces, Scenes, Skeletons and Visual Attentions

Xin Guo, Bin Zhu, Luisa F. Polanía, C. Boncelet, K. Barner

Proceedings of the 20th ACM International Conference on Multimodal Interaction, October 2, 2018. DOI: 10.1145/3242969.3264990

Citations: 30
Abstract
This paper presents a hybrid deep learning network submitted to the 6th Emotion Recognition in the Wild (EmotiW 2018) Grand Challenge [9], in the category of group-level emotion recognition. Advanced deep learning models trained individually on faces, scenes, skeletons and salient regions selected by visual attention mechanisms are fused to classify the emotion of a group of people in an image as positive, neutral or negative. Experimental results show that the proposed hybrid network achieves 78.98% and 68.08% classification accuracy on the validation and testing sets, respectively. These results outperform the baselines of 64% and 61%, and the approach achieved first place in the challenge.
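The abstract describes fusing separately trained branch models (faces, scenes, skeletons, attention-selected regions) into a single three-class decision. A minimal sketch of one common way to do this, late fusion by weighted averaging of per-branch softmax scores, is shown below. The branch names, scores and uniform weights are illustrative assumptions, not the paper's trained models or tuned fusion weights.

```python
import numpy as np

# Group-emotion classes used in the challenge.
CLASSES = ["positive", "neutral", "negative"]

def fuse_predictions(model_scores, weights=None):
    """Weighted average of per-model softmax score vectors.

    model_scores: (n_models, n_classes) softmax outputs, one row per branch.
    weights: optional per-model fusion weights; defaults to uniform.
    Returns the predicted class label and the fused distribution.
    """
    scores = np.asarray(model_scores, dtype=float)
    if weights is None:
        weights = np.full(len(scores), 1.0 / len(scores))
    weights = np.asarray(weights, dtype=float)
    fused = weights @ scores        # weighted sum across models
    fused /= fused.sum()            # renormalize to a probability distribution
    return CLASSES[int(np.argmax(fused))], fused

# Hypothetical softmax outputs from four branches for one test image.
branch_scores = [
    [0.70, 0.20, 0.10],  # face branch
    [0.50, 0.30, 0.20],  # scene branch
    [0.40, 0.40, 0.20],  # skeleton branch
    [0.60, 0.25, 0.15],  # visual-attention branch
]
label, dist = fuse_predictions(branch_scores)
```

In practice the fusion weights would be chosen on the validation set; a uniform average is only the simplest baseline.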