基于改进的 MobileNetV3 的面部表情识别算法研究

IF 2.4 4区 计算机科学
Bin Jiang, Nanxing Li, Xiaomei Cui, Qiuwen Zhang, Huanlong Zhang, Zuhe Li, Weihua Liu
{"title":"基于改进的 MobileNetV3 的面部表情识别算法研究","authors":"Bin Jiang, Nanxing Li, Xiaomei Cui, Qiuwen Zhang, Huanlong Zhang, Zuhe Li, Weihua Liu","doi":"10.1186/s13640-024-00638-z","DOIUrl":null,"url":null,"abstract":"<p>Aiming at the problem that face images are easily interfered by occlusion factors in uncontrollable environments, and the complex structure of traditional convolutional neural networks leads to low expression recognition rates, slow network convergence speed, and long network training time, an improved lightweight convolutional neural network is proposed for facial expression recognition algorithm. First, the dilation convolution is introduced into the shortcut connection of the inverted residual structure in the MobileNetV3 network to expand the receptive field of the convolution kernel and reduce the loss of expression features. Then, the channel attention mechanism SENet in the network is replaced by the two-dimensional (channel and spatial) attention mechanism SimAM introduced without parameters to reduce the network parameters. Finally, in the normalization operation, the Batch Normalization of the backbone network is replaced with Group Normalization, which is stable at various batch sizes, to reduce errors caused by processing small batches of data. Experimental results on RaFD, FER2013, and FER2013Plus face expression data sets show that the network reduces the training times while maintaining network accuracy, improves network convergence speed, and has good convergence effects.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Research on facial expression recognition algorithm based on improved MobileNetV3\",\"authors\":\"Bin Jiang, Nanxing Li, Xiaomei Cui, Qiuwen Zhang, Huanlong Zhang, Zuhe Li, Weihua Liu\",\"doi\":\"10.1186/s13640-024-00638-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Aiming at the problem that face images are easily interfered by occlusion factors in uncontrollable environments, and the complex structure of traditional convolutional neural networks leads to low expression recognition rates, slow network convergence speed, and long network training time, an improved lightweight convolutional neural network is proposed for facial expression recognition algorithm. First, the dilation convolution is introduced into the shortcut connection of the inverted residual structure in the MobileNetV3 network to expand the receptive field of the convolution kernel and reduce the loss of expression features. Then, the channel attention mechanism SENet in the network is replaced by the two-dimensional (channel and spatial) attention mechanism SimAM introduced without parameters to reduce the network parameters. Finally, in the normalization operation, the Batch Normalization of the backbone network is replaced with Group Normalization, which is stable at various batch sizes, to reduce errors caused by processing small batches of data. Experimental results on RaFD, FER2013, and FER2013Plus face expression data sets show that the network reduces the training times while maintaining network accuracy, improves network convergence speed, and has good convergence effects.</p>\",\"PeriodicalId\":49322,\"journal\":{\"name\":\"Eurasip Journal on Image and Video Processing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2024-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Eurasip Journal on Image and Video Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1186/s13640-024-00638-z\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Eurasip Journal on Image and Video Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1186/s13640-024-00638-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

针对人脸图像在不可控环境下易受遮挡因素干扰,以及传统卷积神经网络结构复杂导致表情识别率低、网络收敛速度慢、网络训练时间长等问题,提出了一种改进的轻量级卷积神经网络用于人脸表情识别算法。首先,在 MobileNetV3 网络的倒残差结构的快捷连接中引入扩张卷积,以扩大卷积核的感受野,减少表情特征的损失。然后,将网络中的信道注意机制 SENet 替换为无参数引入的二维(信道和空间)注意机制 SimAM,以减少网络参数。最后,在归一化操作中,将骨干网络的批归一化(Batch Normalization)替换为在各种批量大小下都很稳定的组归一化(Group Normalization),以减少处理小批量数据时产生的误差。在 RaFD、FER2013 和 FER2013Plus 人脸表情数据集上的实验结果表明,该网络在保持网络准确性的同时减少了训练时间,提高了网络收敛速度,具有良好的收敛效果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Research on facial expression recognition algorithm based on improved MobileNetV3

Research on facial expression recognition algorithm based on improved MobileNetV3

Aiming at the problem that face images are easily interfered by occlusion factors in uncontrollable environments, and the complex structure of traditional convolutional neural networks leads to low expression recognition rates, slow network convergence speed, and long network training time, an improved lightweight convolutional neural network is proposed for facial expression recognition algorithm. First, the dilation convolution is introduced into the shortcut connection of the inverted residual structure in the MobileNetV3 network to expand the receptive field of the convolution kernel and reduce the loss of expression features. Then, the channel attention mechanism SENet in the network is replaced by the two-dimensional (channel and spatial) attention mechanism SimAM introduced without parameters to reduce the network parameters. Finally, in the normalization operation, the Batch Normalization of the backbone network is replaced with Group Normalization, which is stable at various batch sizes, to reduce errors caused by processing small batches of data. Experimental results on RaFD, FER2013, and FER2013Plus face expression data sets show that the network reduces the training times while maintaining network accuracy, improves network convergence speed, and has good convergence effects.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Eurasip Journal on Image and Video Processing
Eurasip Journal on Image and Video Processing Engineering-Electrical and Electronic Engineering
CiteScore
7.10
自引率
0.00%
发文量
23
审稿时长
6.8 months
期刊介绍: EURASIP Journal on Image and Video Processing is intended for researchers from both academia and industry, who are active in the multidisciplinary field of image and video processing. The scope of the journal covers all theoretical and practical aspects of the domain, from basic research to development of application.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信