Group-Level Emotion Recognition using Deep Models with A Four-stream Hybrid Network

Ahmed-Shehab Khan, Zhiyuan Li, Jie Cai, Zibo Meng, James O'Reilly, Yan Tong
{"title":"Group-Level Emotion Recognition using Deep Models with A Four-stream Hybrid Network","authors":"Ahmed-Shehab Khan, Zhiyuan Li, Jie Cai, Zibo Meng, James O'Reilly, Yan Tong","doi":"10.1145/3242969.3264987","DOIUrl":null,"url":null,"abstract":"Group-level Emotion Recognition (GER) in the wild is a challenging task gaining lots of attention. Most recent works utilized two channels of information, a channel involving only faces and a channel containing the whole image, to solve this problem. However, modeling the relationship between faces and scene in a global image remains challenging. In this paper, we proposed a novel face-location aware global network, capturing the face location information in the form of an attention heatmap to better model such relationships. We also proposed a multi-scale face network to infer the group-level emotion from individual faces, which explicitly handles high variance in image and face size, as images in the wild are collected from different sources with different resolutions. In addition, a global blurred stream was developed to explicitly learn and extract the scene-only features. Finally, we proposed a four-stream hybrid network, consisting of the face-location aware global stream, the multi-scale face stream, a global blurred stream, and a global stream, to address the GER task, and showed the effectiveness of our method in GER sub-challenge, a part of the six Emotion Recognition in the Wild (EmotiW 2018) [10] Challenge. 
The proposed method achieved 65.59% and 78.39% accuracy on the testing and validation sets, respectively, and is ranked the third place on the leaderboard.","PeriodicalId":308751,"journal":{"name":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","volume":"101 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3242969.3264987","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22

Abstract

Group-level Emotion Recognition (GER) in the wild is a challenging task that has gained increasing attention. Most recent works utilized two channels of information, a face-only channel and a whole-image channel, to solve this problem. However, modeling the relationship between the faces and the scene in a global image remains challenging. In this paper, we proposed a novel face-location aware global network that captures face location information in the form of an attention heatmap to better model such relationships. We also proposed a multi-scale face network to infer the group-level emotion from individual faces; it explicitly handles the high variance in image and face size, as images in the wild are collected from different sources at different resolutions. In addition, a global blurred stream was developed to explicitly learn and extract scene-only features. Finally, we proposed a four-stream hybrid network, consisting of the face-location aware global stream, the multi-scale face stream, the global blurred stream, and a global stream, to address the GER task, and showed the effectiveness of our method in the GER sub-challenge, part of the sixth Emotion Recognition in the Wild (EmotiW 2018) [10] Challenge. The proposed method achieved 65.59% and 78.39% accuracy on the testing and validation sets, respectively, and ranked third place on the leaderboard.
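The two key ideas in the abstract can be illustrated in miniature: encoding face locations as a soft attention heatmap over the global image, and fusing the class scores of several streams into one group-level prediction. The sketch below is not the authors' implementation; the Gaussian-per-face heatmap and the weighted late-fusion rule are plausible, commonly used stand-ins for the unspecified details, and the function names and parameters are our own.

```python
import numpy as np


def face_location_heatmap(image_shape, face_boxes, sigma_scale=0.5):
    """Build a 2-D attention heatmap from detected face boxes.

    Each face contributes a Gaussian centred on its bounding box, with a
    spread proportional to the face size; the map is normalised to [0, 1]
    so it can modulate features of the global image stream.
    """
    h, w = image_shape
    ys, xs = np.mgrid[0:h, 0:w]
    heatmap = np.zeros((h, w), dtype=np.float64)
    for (x0, y0, x1, y1) in face_boxes:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        sigma = max(x1 - x0, y1 - y0) * sigma_scale + 1e-6
        heatmap += np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    if heatmap.max() > 0:
        heatmap /= heatmap.max()
    return heatmap


def fuse_streams(stream_probs, weights=None):
    """Late fusion: weighted average of per-stream class probabilities.

    `stream_probs` is a list of probability vectors, one per stream
    (e.g. face-location aware global, multi-scale face, global blurred,
    global); equal weights are used unless given.
    """
    probs = np.asarray(stream_probs, dtype=np.float64)
    if weights is None:
        weights = np.ones(len(probs)) / len(probs)
    fused = np.average(probs, axis=0, weights=weights)
    return fused / fused.sum()


# Example: one face in an 8x8 image, then fusing four streams over
# three group-emotion classes (e.g. positive / neutral / negative).
hm = face_location_heatmap((8, 8), [(2, 2, 4, 4)])
fused = fuse_streams([[0.7, 0.2, 0.1],
                      [0.5, 0.3, 0.2],
                      [0.6, 0.2, 0.2],
                      [0.4, 0.4, 0.2]])
```

In practice each stream would be a CNN and the fusion weights could themselves be learned on the validation set; this sketch only shows the shape of the computation.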