Multi-loss, feature fusion and improved top-two-voting ensemble for facial expression recognition in the wild

Impact factor: 6.0 | Zone 1: Computer Science | JCR Q1: Computer Science, Artificial Intelligence
Guangyao Zhou, Yuanlun Xie, Yiqin Fu, Zhaokun Wang
Neural Networks, vol. 183, Article 106937. Published 2024-11-26. DOI: 10.1016/j.neunet.2024.106937
https://www.sciencedirect.com/science/article/pii/S0893608024008669

Abstract

Facial expression recognition (FER) in the wild is a challenging pattern recognition task that is hampered by low image quality and has attracted broad interest in computer vision. Existing FER methods do not reach the accuracy needed to support practical applications, especially in scenarios with low fault tolerance, which limits the adaptability of FER. To explore how far the accuracy of FER in the wild can be improved, this paper proposes a novel single model named R18+FAML and an ensemble model named R18+FAML-FGA-T2V, which apply intra-feature fusion within a single network, feature fusion among multiple networks, and an ensemble decision strategy. Built on a ResNet18 (R18) backbone, R18+FAML combines internal feature fusion with three attention blocks and uses multiple loss functions (FAML) to improve the diversity of feature extraction. To effectively integrate feature extractors from multiple networks, we propose feature fusion among networks based on the genetic algorithm (FGA). To consider and use more of the classification information, we propose an ensemble strategy, the improved top-two-voting (T2V) of multiple networks with the same structure. Combining these strategies, R18+FAML-FGA-T2V can focus on the main expression-aware areas by integrating the areas of interest of multiple networks. In experiments on three challenging in-the-wild FER datasets, RAF-DB, AffectNet-8 and AffectNet-7, the single model R18+FAML achieves accuracies of 90.32%, 62.17% and 65.83%, and the ensemble model R18+FAML-FGA-T2V achieves 91.59%, 63.27% and 66.63%, respectively, both achieving state-of-the-art results.
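The abstract does not spell out how the improved top-two-voting (T2V) works, so the following is only a generic sketch of the underlying idea: each network votes for its two most probable classes rather than only its top one. The function name `top_two_vote`, the `second_weight` parameter, and the tie-breaking rule (summed probability) are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def top_two_vote(probs_per_model: Sequence[Sequence[float]],
                 second_weight: float = 0.5) -> int:
    """Pick a class index by a simple top-two vote (illustrative only).

    Each model gives 1 vote to its most probable class and
    `second_weight` votes to its runner-up. Ties in the vote count are
    broken by the probability summed over all models. Both the weight
    and the tie-break are assumptions, not the paper's improved T2V.
    """
    if not probs_per_model:
        raise ValueError("need at least one model output")
    num_classes = len(probs_per_model[0])
    if num_classes < 2:
        raise ValueError("need at least two classes")

    votes = [0.0] * num_classes
    prob_sum = [0.0] * num_classes
    for probs in probs_per_model:
        if len(probs) != num_classes:
            raise ValueError("all models must output the same classes")
        # Class indices sorted from most to least probable for this model.
        ranked = sorted(range(num_classes), key=lambda c: probs[c],
                        reverse=True)
        votes[ranked[0]] += 1.0
        votes[ranked[1]] += second_weight
        for c, p in enumerate(probs):
            prob_sum[c] += p

    # Highest vote count wins; summed probability breaks ties.
    return max(range(num_classes), key=lambda c: (votes[c], prob_sum[c]))


# Hypothetical softmax outputs of three networks over three classes.
example = [
    [0.50, 0.40, 0.10],
    [0.10, 0.50, 0.40],
    [0.45, 0.35, 0.20],
]
winner = top_two_vote(example)
majority_only = top_two_vote(example, second_weight=0.0)
```

With `second_weight=0.0` the function reduces to plain majority voting, which makes it easy to see when the runner-up votes change the decision: in the hypothetical example, class 0 wins the plain majority vote, while class 1 gathers enough second-place support to win the top-two vote.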
Source journal: Neural Networks
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically-inspired artificial intelligence.