Research on facial expression recognition algorithm based on improved MobileNetV3

IF 1.8, Zone 4 (Computer Science)

Eurasip Journal on Image and Video Processing. Pub Date: 2024-08-22. DOI: 10.1186/s13640-024-00638-z

Bin Jiang, Nanxing Li, Xiaomei Cui, Qiuwen Zhang, Huanlong Zhang, Zuhe Li, Weihua Liu

Citations: 0

Abstract

Face images captured in uncontrolled environments are easily degraded by occlusion, and the complex structure of traditional convolutional neural networks leads to low expression recognition rates, slow convergence, and long training times. To address these problems, an improved lightweight convolutional neural network is proposed for facial expression recognition. First, dilated convolution is introduced into the shortcut connection of the inverted residual structure in MobileNetV3 to enlarge the receptive field of the convolution kernel and reduce the loss of expression features. Then, the channel attention mechanism SENet is replaced by SimAM, a parameter-free attention mechanism that covers both the channel and spatial dimensions, to reduce the number of network parameters. Finally, Batch Normalization in the backbone network is replaced with Group Normalization, which is stable across batch sizes, to reduce the errors caused by processing small batches of data. Experimental results on the RaFD, FER2013, and FER2013Plus facial expression datasets show that the network reduces training time while maintaining accuracy and improves convergence speed.
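The two replacement components named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the commonly published SimAM formulation (an energy-based weight passed through a sigmoid) and a Group Normalization without the learnable scale and shift; the function names, the `e_lambda` and `eps` defaults, and the nested-list `[C][H][W]` layout are all assumptions chosen for illustration.

```python
import math


def simam(feature_map, e_lambda=1e-4):
    """SimAM-style attention for one sample shaped [C][H][W].

    No learnable parameters: each activation is reweighted by how far
    it deviates from its channel mean. e_lambda is an assumed default.
    """
    out = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        n = len(values) - 1  # H*W - 1, so each channel needs >= 2 pixels
        mu = sum(values) / len(values)
        sq_dev = [[(v - mu) ** 2 for v in row] for row in channel]
        variance = sum(sum(row) for row in sq_dev) / n
        # Inverse energy: larger for pixels that stand out in the channel.
        # Output is x * sigmoid(energy), written as x / (1 + exp(-energy)).
        out.append([
            [v / (1.0 + math.exp(-(d / (4.0 * (variance + e_lambda)) + 0.5)))
             for v, d in zip(row, d_row)]
            for row, d_row in zip(channel, sq_dev)
        ])
    return out


def group_norm(feature_map, num_groups, eps=1e-5):
    """Group Normalization for one sample shaped [C][H][W].

    Statistics come from channel groups of this sample only, never from
    other samples. The learnable affine parameters are omitted here.
    """
    channels = len(feature_map)
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = channels // num_groups
    out = []
    for g in range(num_groups):
        group = feature_map[g * size:(g + 1) * size]
        values = [v for ch in group for row in ch for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)  # biased
        scale = 1.0 / math.sqrt(var + eps)
        out.extend(
            [[(v - mean) * scale for v in row] for row in ch] for ch in group
        )
    return out


if __name__ == "__main__":
    sample = [[[1.0, 1.0], [1.0, 5.0]], [[0.0, 2.0], [4.0, 6.0]]]
    print(simam(sample))
    print(group_norm(sample, num_groups=2))
```

Because `group_norm` takes a single sample, its output cannot depend on how many other samples share the batch, which is the property the abstract relies on for small batches. `simam` adds no weights to the model, which is consistent with the stated goal of reducing parameters relative to SENet.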

Source journal

Eurasip Journal on Image and Video Processing (Engineering: Electrical and Electronic Engineering)

CiteScore: 7.10
Self-citation rate: 0.00%
Articles published: (not given)
Review time: 6.8 months

About the journal: EURASIP Journal on Image and Video Processing is intended for researchers from both academia and industry who are active in the multidisciplinary field of image and video processing. The scope of the journal covers all theoretical and practical aspects of the domain, from basic research to the development of applications.