LUCFER: A Large-Scale Context-Sensitive Image Dataset for Deep Learning of Visual Emotions

Pooyan Balouchian, M. Safaei, H. Foroosh
{"title":"LUCFER: A Large-Scale Context-Sensitive Image Dataset for Deep Learning of Visual Emotions","authors":"Pooyan Balouchian, M. Safaei, H. Foroosh","doi":"10.1109/WACV.2019.00180","DOIUrl":null,"url":null,"abstract":"Still image emotion recognition has been receiving increasing attention in recent years due to the tremendous amount of social media content available on the Web. Opinion mining, visual emotion analysis, search and retrieval are among the application areas, to name a few. While there exist works on the subject, offering methods to detect image sentiment; i.e. recognizing the polarity of the image, less efforts focus on emotion analysis; i.e. dealing with recognizing the exact emotion aroused when exposed to certain visual stimuli. Main gaps tackled in this work include (1) lack of large-scale image datasets for deep learning of visual emotions and (2) lack of context-sensitive single-modality approaches in emotion analysis in the still image domain. In this paper, we introduce LUCFER (Pronounced LU-CI-FER), a dataset containing over 3.6M images, with 3-dimensional labels; i.e. emotion, context and valence. LUCFER, the largest dataset of the kind currently available, is collected using a novel data collection pipeline, proposed and implemented in this work. Moreover, we train a context-sensitive deep classifier using a novel multinomial classification technique proposed here via adding a dimensionality reduction layer to the CNN. 
Relying on our categorical approach to emotion recognition, we claim and show empirically that injecting context to our unified training process helps (1) achieve a more balanced precision and recall, and (2) boost performance, yielding an overall classification accuracy of 73.12% compared to 58.3% achieved in the closest work in the literature.","PeriodicalId":436637,"journal":{"name":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV.2019.00180","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

Abstract

Still-image emotion recognition has received increasing attention in recent years due to the tremendous amount of social media content available on the Web. Application areas include opinion mining, visual emotion analysis, and search and retrieval, to name a few. While prior work offers methods to detect image sentiment, i.e., the polarity of an image, less effort has focused on emotion analysis, i.e., recognizing the exact emotion aroused by exposure to a given visual stimulus. The main gaps tackled in this work are (1) the lack of large-scale image datasets for deep learning of visual emotions and (2) the lack of context-sensitive, single-modality approaches to emotion analysis in the still-image domain. In this paper, we introduce LUCFER (pronounced LU-CI-FER), a dataset containing over 3.6M images with three-dimensional labels: emotion, context, and valence. LUCFER, the largest dataset of its kind currently available, is collected using a novel data-collection pipeline proposed and implemented in this work. Moreover, we train a context-sensitive deep classifier using a novel multinomial classification technique, proposed here, that adds a dimensionality-reduction layer to the CNN. Relying on our categorical approach to emotion recognition, we claim and show empirically that injecting context into our unified training process helps (1) achieve a more balanced precision and recall and (2) boost performance, yielding an overall classification accuracy of 73.12%, compared to the 58.3% achieved by the closest work in the literature.
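The abstract does not spell out the architecture, but the core idea — a dimensionality-reduction layer inserted between CNN features and a multinomial head over joint emotion-context labels — can be sketched numerically. The sketch below is an illustration only: the feature sizes, label counts, random projections, and the joint-label encoding are all assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes (not from the paper): backbone feature width, reduced
# dimension, and the sizes of the emotion and context label spaces.
FEAT_DIM, REDUCED_DIM = 512, 64
N_EMOTIONS, N_CONTEXTS = 8, 10
N_JOINT = N_EMOTIONS * N_CONTEXTS  # one class per (emotion, context) pair

# Stand-in for CNN features of a batch of 2 images.
features = rng.standard_normal((2, FEAT_DIM))

# The added dimensionality-reduction layer: a learned linear projection
# (random weights here, since this is only a shape-level sketch).
W_reduce = rng.standard_normal((FEAT_DIM, REDUCED_DIM)) * 0.01
reduced = features @ W_reduce

# Multinomial (softmax) head over the joint label space, so context is
# injected directly into the unified training objective.
W_head = rng.standard_normal((REDUCED_DIM, N_JOINT)) * 0.01
logits = reduced @ W_head
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Decode a joint prediction back into (emotion, context) indices.
pred = probs.argmax(axis=1)
emotion, context = np.divmod(pred, N_CONTEXTS)
print(probs.shape, emotion, context)
```

Training such a head with a standard cross-entropy loss over the 80 joint classes would make context part of every gradient step, which is one plausible reading of the paper's "unified training process"; the per-dimension valence label would need a separate head under this encoding.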