Micro-Expression Recognition Based on Spatio-Temporal Feature Extraction of Key Regions

Wenqiu Zhu, Yongsheng Li, Qiang Liu, Zhigao Zeng
{"title":"基于关键区域时空特征提取的微表情识别","authors":"Wenqiu Zhu, Yongsheng Li, Qiang Liu, Zhigao Zeng","doi":"10.32604/cmc.2023.037216","DOIUrl":null,"url":null,"abstract":"Aiming at the problems of short duration, low intensity, and difficult detection of micro-expressions (MEs), the global and local features of ME video frames are extracted by combining spatial feature extraction and temporal feature extraction. Based on traditional convolution neural network (CNN) and long short-term memory (LSTM), a recognition method combining global identification attention network (GIA), block identification attention network (BIA) and bi-directional long short-term memory (Bi-LSTM) is proposed. In the BIA, the ME video frame will be cropped, and the training will be carried out by cropping into 24 identification blocks (IBs), 10 IBs and uncropped IBs. To alleviate the overfitting problem in training, we first extract the basic features of the pre-processed sequence through the transfer learning layer, and then extract the global and local spatial features of the output data through the GIA layer and the BIA layer, respectively. In the BIA layer, the input data will be cropped into local feature vectors with attention weights to extract the local features of the ME frames; in the GIA layer, the global features of the ME frames will be extracted. Finally, after fusing the global and local feature vectors, the ME time-series information is extracted by Bi-LSTM. The experimental results show that using IBs can significantly improve the model's ability to extract subtle facial features, and the model works best when 10 IBs are used.","PeriodicalId":93535,"journal":{"name":"Computers, materials & continua","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Micro-Expression Recognition Based on Spatio-Temporal Feature Extraction of Key Regions\",\"authors\":\"Wenqiu Zhu, Yongsheng Li, Qiang Liu, Zhigao Zeng\",\"doi\":\"10.32604/cmc.2023.037216\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Aiming at the problems of short duration, low intensity, and difficult detection of micro-expressions (MEs), the global and local features of ME video frames are extracted by combining spatial feature extraction and temporal feature extraction. Based on traditional convolution neural network (CNN) and long short-term memory (LSTM), a recognition method combining global identification attention network (GIA), block identification attention network (BIA) and bi-directional long short-term memory (Bi-LSTM) is proposed. In the BIA, the ME video frame will be cropped, and the training will be carried out by cropping into 24 identification blocks (IBs), 10 IBs and uncropped IBs. To alleviate the overfitting problem in training, we first extract the basic features of the pre-processed sequence through the transfer learning layer, and then extract the global and local spatial features of the output data through the GIA layer and the BIA layer, respectively. In the BIA layer, the input data will be cropped into local feature vectors with attention weights to extract the local features of the ME frames; in the GIA layer, the global features of the ME frames will be extracted. Finally, after fusing the global and local feature vectors, the ME time-series information is extracted by Bi-LSTM. 
The experimental results show that using IBs can significantly improve the model's ability to extract subtle facial features, and the model works best when 10 IBs are used.\",\"PeriodicalId\":93535,\"journal\":{\"name\":\"Computers, materials & continua\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers, materials & continua\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32604/cmc.2023.037216\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers, materials & continua","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32604/cmc.2023.037216","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Aiming at the problems of short duration, low intensity, and difficult detection of micro-expressions (MEs), the global and local features of ME video frames are extracted by combining spatial and temporal feature extraction. Based on the traditional convolutional neural network (CNN) and long short-term memory (LSTM), a recognition method combining a global identification attention network (GIA), a block identification attention network (BIA), and bi-directional long short-term memory (Bi-LSTM) is proposed. In the BIA, the ME video frames are cropped, and training is carried out under three configurations: 24 identification blocks (IBs), 10 IBs, and uncropped frames. To alleviate overfitting during training, the basic features of the pre-processed sequence are first extracted through a transfer learning layer, and then the global and local spatial features of its output are extracted through the GIA layer and the BIA layer, respectively. In the BIA layer, the input data are cropped into local feature vectors with attention weights to extract the local features of the ME frames; in the GIA layer, the global features of the ME frames are extracted. Finally, after fusing the global and local feature vectors, the ME time-series information is extracted by Bi-LSTM. The experimental results show that using IBs significantly improves the model's ability to extract subtle facial features, and the model works best when 10 IBs are used.
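
The abstract describes a two-stage pipeline: per-frame spatial features from a transfer-learning backbone are re-weighted globally (GIA) and over cropped identification blocks (BIA), the two feature vectors are fused, and the fused per-frame vectors are passed to a Bi-LSTM for temporal modelling. The PyTorch sketch below illustrates that data flow only; the small convolutional encoder standing in for the transfer-learning layer, the horizontal-strip pooling used as a stand-in for the paper's 10-IB face cropping, the channel-attention form of the GIA, and all layer sizes and class counts are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of the GIA + BIA + Bi-LSTM data flow described in the abstract.
# Backbone, block-cropping scheme, attention forms, and dimensions are assumptions.
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    """Stand-in for the GIA layer: global spatial features with channel attention."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, H, W)
        w = self.fc(self.pool(x).flatten(1))   # (B, C) attention weights
        x = x * w[:, :, None, None]            # re-weight the feature maps
        return self.pool(x).flatten(1)         # (B, C) global feature vector

class BlockAttention(nn.Module):
    """Stand-in for the BIA layer: pool the feature map into identification
    blocks (IBs) and combine them with learned attention weights."""
    def __init__(self, channels, n_blocks=10):
        super().__init__()
        # Horizontal strips as a crude substitute for the paper's face-region IBs.
        self.strip_pool = nn.AdaptiveAvgPool2d((n_blocks, 1))
        self.score = nn.Linear(channels, 1)     # one attention score per IB

    def forward(self, x):                       # x: (B, C, H, W)
        feats = self.strip_pool(x).squeeze(-1).transpose(1, 2)  # (B, N, C)
        attn = torch.softmax(self.score(feats), dim=1)          # (B, N, 1)
        return (attn * feats).sum(dim=1)                        # (B, C)

class MERecognizer(nn.Module):
    def __init__(self, channels=64, hidden=128, n_classes=5, n_blocks=10):
        super().__init__()
        # Simple conv encoder standing in for the transfer-learning layer.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.gia = GlobalAttention(channels)
        self.bia = BlockAttention(channels, n_blocks)
        self.bilstm = nn.LSTM(2 * channels, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, clip):                    # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        x = self.backbone(clip.flatten(0, 1))   # per-frame spatial features
        fused = torch.cat([self.gia(x), self.bia(x)], dim=1)  # global + local
        seq = fused.view(b, t, -1)              # restore the time dimension
        out, _ = self.bilstm(seq)               # temporal features via Bi-LSTM
        return self.head(out[:, -1])            # classify from the last step

# Example: a batch of 2 clips, 16 frames each, 96x96 RGB.
logits = MERecognizer()(torch.randn(2, 16, 3, 96, 96))
print(logits.shape)  # torch.Size([2, 5])
```

Fusing the global and local feature vectors frame by frame before the Bi-LSTM mirrors the order of operations in the abstract: spatial feature extraction and attention first, temporal modelling of the ME sequence last.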