An improved end-to-end micro-expression recognition system for real-world videos via dual-input CNN architecture

IF 7.5 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)

Expert Systems with Applications · Pub Date: 2025-05-10 · DOI: 10.1016/j.eswa.2025.128062

Y.S. Gan, Kun-Hong Liu, Min-Huan Wu, Gen-Bing Liong, Sze-Teng Liong

Volume 286, Article 128062 · Journal Article

https://www.sciencedirect.com/science/article/pii/S0957417425016835

Citations: 0

Abstract

Micro-expression (ME) recognition reveals nonverbal emotions through subtle, involuntary facial muscle movements. However, the development and commercialization of ME recognition systems have been hindered by the lack of databases that accurately reflect real-world conditions. This study addresses this challenge by proposing a robust end-to-end system designed to operate effectively in unconstrained environments. Existing methods typically rely on a single apex frame, which may be unreliable due to noise, occlusions, or lighting variations. To address these issues, a 3D facial reconstruction technique is applied as a pre-processing step to normalize pose and lighting. A novel dual-peak frame detection strategy is then introduced to extract two expressive optical flow frames, reducing the impact of noise from any single frame. Finally, a Shallow and Small-size Dual-input (SSD) CNN architecture is designed to jointly process the two frames for improved emotion classification. The proposed system achieves strong performance on the challenging in-the-wild MEVIEW dataset, with accuracy and F1-score of 75 % and 77.68 %, respectively. Comprehensive evaluations further validate the effectiveness of the pipeline, highlighting its potential for real-world ME recognition applications.
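The abstract does not say how the two expressive frames are located, so the following is only a minimal sketch of one plausible reading of "dual-peak frame detection". It assumes a hypothetical per-frame motion score (for example, the mean optical-flow magnitude of each frame relative to the first frame) and picks the two strongest local maxima that are at least `min_gap` frames apart. The function name, the `min_gap` parameter, and the fallback rule are illustrative assumptions, not the authors' method.

```python
from typing import Sequence, Tuple


def pick_two_peaks(motion: Sequence[float], min_gap: int = 3) -> Tuple[int, int]:
    """Return the indices of two high-motion frames, in temporal order.

    Assumption (not from the paper): `motion[i]` is a scalar motion score
    for frame i, and the two frames should be at least `min_gap` apart.
    """
    n = len(motion)
    if n < 2:
        raise ValueError("need at least two frames")

    # Local maxima: frames whose score is not below either neighbour.
    peaks = [
        i for i in range(n)
        if (i == 0 or motion[i] >= motion[i - 1])
        and (i == n - 1 or motion[i] >= motion[i + 1])
    ]
    # Strongest peak first; the global maximum is always a local maximum.
    ranked = sorted(peaks, key=lambda i: motion[i], reverse=True)
    first = ranked[0]

    # Prefer a second local maximum that is far enough from the first.
    for i in ranked[1:]:
        if abs(i - first) >= min_gap:
            return tuple(sorted((first, i)))

    # Fallback: any other frame, strongest first, far enough if possible.
    others = sorted((i for i in range(n) if i != first),
                    key=lambda i: motion[i], reverse=True)
    for i in others:
        if abs(i - first) >= min_gap:
            return tuple(sorted((first, i)))
    return tuple(sorted((first, others[0])))


if __name__ == "__main__":
    scores = [0.1, 0.4, 0.9, 0.5, 0.2, 0.3, 0.8, 0.6, 0.1]
    print(pick_two_peaks(scores))  # (2, 6)
```

In a pipeline like the one described, the two selected indices would identify the optical-flow frames passed to the two inputs of the classifier, so that a single noisy frame does not determine the prediction.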

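Source journal

Expert Systems with Applications (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months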

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.