Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns

Proceedings of the 2015 ACM on International Conference on Multimodal Interaction. Pub Date: 2015-11-09. DOI: 10.1145/2818346.2830587

Gil Levi, Tal Hassner


Citations: 303

Abstract

We present a novel method for classifying emotions from static facial images. Our approach leverages the recent success of Convolutional Neural Networks (CNNs) on face recognition problems. Unlike the settings often assumed there, far less labeled data is typically available for training emotion classification systems. Our method is therefore designed to simplify the problem domain by removing confounding factors from the input images, with an emphasis on image illumination variations, in an effort to reduce the amount of data required to effectively train deep CNN models. To this end, we propose novel transformations of image intensities to 3D spaces, designed to be invariant to monotonic photometric transformations. These are applied to CASIA Webface images, which are then used to train an ensemble of CNNs with multiple architectures on multiple representations. Each model is then fine-tuned with limited emotion-labeled training data to obtain final classification models. Our method was tested on the Emotion Recognition in the Wild Challenge (EmotiW 2015), Static Facial Expression Recognition sub-challenge (SFEW), and shown to provide a substantial 15.36% improvement over baseline results (a 40% gain in performance).
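The invariance to monotonic photometric transformations mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the "binary patterns" of the title are the standard 8-neighbour Local Binary Patterns (LBP); the function name `lbp_codes`, the random test image, and the particular brightness curve are all hypothetical choices made for illustration. The sketch does not reproduce the authors' mapping of codes into a 3D space. It only shows why binary-pattern codes are a natural starting point: they depend on the ordering of pixel intensities, not on their values.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel of a 2D array."""
    img = np.asarray(img)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        mask = (neighbour >= center).astype(np.uint8)
        codes |= mask * np.uint8(1 << bit)
    return codes

# Hypothetical stand-in for a grayscale face crop.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(32, 32)).astype(np.float64)

# A strictly increasing (monotonic) change of brightness and contrast.
relit = 10.0 + 200.0 * np.sqrt(face / 255.0)

# The codes depend only on intensity ordering, so they are unchanged.
same = np.array_equal(lbp_codes(face), lbp_codes(relit))
print(same)
```

Because each bit only records whether a neighbour is at least as bright as the centre pixel, any strictly increasing intensity transform leaves every comparison, and hence every code, unchanged.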

