InFER++: Real-World Indian Facial Expression Dataset

Syed Sameen Ahmad Rizvi;Aryan Seth;Jagat Sesh Challa;Pratik Narang
{"title":"InFER++:真实世界印度面部表情数据集","authors":"Syed Sameen Ahmad Rizvi;Aryan Seth;Jagat Sesh Challa;Pratik Narang","doi":"10.1109/OJCS.2024.3443511","DOIUrl":null,"url":null,"abstract":"Detecting facial expressions is a challenging task in the field of computer vision. Several datasets and algorithms have been proposed over the past two decades; however, deploying them in real-world, in-the-wild scenarios hampers the overall performance. This is because the training data does not completely represent socio-cultural and ethnic diversity; the majority of the datasets consist of American and Caucasian populations. On the contrary, in a diverse and heterogeneous population distribution like the Indian subcontinent, the need for a significantly large enough dataset representing all the ethnic groups is even more critical. To address this, we present InFER++, an India-specific, multi-ethnic, real-world, in-the-wild facial expression dataset consisting of seven basic expressions. To the best of our knowledge, this is the largest India-specific facial expression dataset. Our cross-dataset analysis of RAF-DB vs InFER++ shows that models trained on RAF-DB were not generalizable to ethnic datasets like InFER++. This is because the facial expressions change with respect to ethnic and socio-cultural factors. We also present LiteXpressionNet, a lightweight deep facial expression network that outperforms many existing lightweight models with considerably fewer FLOPs and parameters. The proposed model is inspired by MobileViTv2 architecture, which utilizes GhostNetv2 blocks to increase parametrization while reducing latency and FLOP requirements. The model is trained with a novel objective function that combines early learning regularization and symmetric cross-entropy loss to mitigate human uncertainties and annotation bias in most real-world facial expression datasets.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"406-417"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10636346","citationCount":"0","resultStr":"{\"title\":\"InFER++: Real-World Indian Facial Expression Dataset\",\"authors\":\"Syed Sameen Ahmad Rizvi;Aryan Seth;Jagat Sesh Challa;Pratik Narang\",\"doi\":\"10.1109/OJCS.2024.3443511\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Detecting facial expressions is a challenging task in the field of computer vision. Several datasets and algorithms have been proposed over the past two decades; however, deploying them in real-world, in-the-wild scenarios hampers the overall performance. This is because the training data does not completely represent socio-cultural and ethnic diversity; the majority of the datasets consist of American and Caucasian populations. On the contrary, in a diverse and heterogeneous population distribution like the Indian subcontinent, the need for a significantly large enough dataset representing all the ethnic groups is even more critical. To address this, we present InFER++, an India-specific, multi-ethnic, real-world, in-the-wild facial expression dataset consisting of seven basic expressions. To the best of our knowledge, this is the largest India-specific facial expression dataset. Our cross-dataset analysis of RAF-DB vs InFER++ shows that models trained on RAF-DB were not generalizable to ethnic datasets like InFER++. 
This is because the facial expressions change with respect to ethnic and socio-cultural factors. We also present LiteXpressionNet, a lightweight deep facial expression network that outperforms many existing lightweight models with considerably fewer FLOPs and parameters. The proposed model is inspired by MobileViTv2 architecture, which utilizes GhostNetv2 blocks to increase parametrization while reducing latency and FLOP requirements. The model is trained with a novel objective function that combines early learning regularization and symmetric cross-entropy loss to mitigate human uncertainties and annotation bias in most real-world facial expression datasets.\",\"PeriodicalId\":13205,\"journal\":{\"name\":\"IEEE Open Journal of the Computer Society\",\"volume\":\"5 \",\"pages\":\"406-417\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10636346\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Computer Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10636346/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Computer Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10636346/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Detecting facial expressions is a challenging task in computer vision. Numerous datasets and algorithms have been proposed over the past two decades; however, their performance degrades when they are deployed in real-world, in-the-wild scenarios. This is because the training data does not fully represent socio-cultural and ethnic diversity; the majority of existing datasets are drawn from American and Caucasian populations. By contrast, for a population as diverse and heterogeneous as that of the Indian subcontinent, a sufficiently large dataset representing all ethnic groups is even more critical. To address this, we present InFER++, an India-specific, multi-ethnic, real-world, in-the-wild facial expression dataset covering the seven basic expressions. To the best of our knowledge, this is the largest India-specific facial expression dataset. Our cross-dataset analysis of RAF-DB versus InFER++ shows that models trained on RAF-DB do not generalize to ethnicity-specific datasets such as InFER++, because facial expressions vary with ethnic and socio-cultural factors. We also present LiteXpressionNet, a lightweight deep facial expression network that outperforms many existing lightweight models while requiring considerably fewer FLOPs and parameters. The proposed model is inspired by the MobileViTv2 architecture and uses GhostNetv2 blocks to increase parametrization while reducing latency and FLOP requirements. The model is trained with a novel objective function that combines early-learning regularization and symmetric cross-entropy loss to mitigate the human uncertainty and annotation bias present in most real-world facial expression datasets.
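The abstract does not spell out the combined objective, but both ingredients are standard in the noisy-label literature. The following is a minimal, hedged PyTorch-style sketch (not the authors' implementation) of a loss that mixes symmetric cross-entropy with early-learning regularization; the class name SCEWithELR and the hyperparameters alpha, beta, lam, and mom are illustrative placeholders.

import torch
import torch.nn.functional as F

class SCEWithELR(torch.nn.Module):
    """Illustrative combination of symmetric cross-entropy (SCE) and
    early-learning regularization (ELR); not the authors' code."""

    def __init__(self, num_samples, num_classes, alpha=1.0, beta=1.0, lam=3.0, mom=0.7):
        super().__init__()
        self.alpha, self.beta, self.lam, self.mom = alpha, beta, lam, mom
        # ELR keeps a running soft target for every training sample.
        self.register_buffer("targets", torch.zeros(num_samples, num_classes))

    def forward(self, logits, labels, indices):
        # indices are the dataset indices of the samples in this batch.
        probs = F.softmax(logits, dim=1).clamp(1e-4, 1.0 - 1e-4)

        # Symmetric cross-entropy: standard CE plus reverse CE, which
        # down-weights confident fits to possibly mislabeled samples.
        ce = F.cross_entropy(logits, labels)
        one_hot = F.one_hot(labels, num_classes=probs.size(1)).float()
        rce = -(probs * torch.log(one_hot.clamp(min=1e-4))).sum(dim=1).mean()
        sce = self.alpha * ce + self.beta * rce

        # Early-learning regularization: update the running targets with
        # detached predictions, then pull current predictions toward them
        # so the network does not drift onto noisy labels late in training.
        with torch.no_grad():
            p = probs.detach()
            p = p / p.sum(dim=1, keepdim=True)
            self.targets[indices] = self.mom * self.targets[indices] + (1.0 - self.mom) * p
        elr = torch.log(1.0 - (self.targets[indices] * probs).sum(dim=1)).mean()

        return sce + self.lam * elr

A training loop would pass per-sample dataset indices (e.g., returned by the Dataset alongside each image) so the running targets can be indexed correctly, and would tune lam on a validation split.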