Structure Representation With Adaptive and Compact Facial Graph for Micro-Expression Recognition

Impact Factor: 5.0

IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2024-10-14 | DOI: 10.1109/TBIOM.2024.3479333

Chunlei Li; Renwei Ba; Xueping Wang; Miao Yu; Xiao Li; Di Huang

Volume 7, Issue 2, pages 256-269. Publication type: Journal Article. Full record: https://ieeexplore.ieee.org/document/10715688/

Citations: 0

Abstract

The subtle, low-intensity motions of micro-expressions (MEs) leave few effective features for micro-expression recognition (MER), making MER a challenging task. Existing works mainly focus on constructing strong representations from entire videos, individual frames, or redundant structural graphs; however, spatial structure feature learning for MEs leaves much room for improvement. To address this issue, this paper introduces a novel two-stream network for MER that requires no prior knowledge, called the Focusing on Few Discriminative Information Network (FFDIN). Specifically, in the temporal stream, the difference between the Apex and Onset frames is used as input to reduce redundant information and aggregate temporal information. Meanwhile, spatial attention is incorporated into the CNN stream to encourage the network to focus on salient features. In the structural stream, the Adaptively Select Strategy (ADSS) is proposed to automatically locate the few effective regions of MEs by selecting the cropped patches with strong long-term dependencies and the corresponding adjacency matrix. Then, the Graph Nodes Generation (GNG) module is designed to capture local and global information in the tiny cropped patches and project the feature maps into graph nodes. Extensive experiments conducted on the CASME II, SAMM, and SMIC datasets demonstrate that the proposed network can achieve performance superior to state-of-the-art methods.
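
The temporal-stream input described in the abstract, the difference between the Apex and Onset frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name frame_difference, the nested-list grayscale frame layout, and the toy pixel values are all assumptions made for illustration, and the abstract does not say whether the difference is signed, absolute, or normalized. The sketch uses a signed per-pixel difference.

```python
def frame_difference(onset, apex):
    """Return the signed per-pixel difference (apex - onset).

    Both frames are assumed to be grayscale images stored as nested
    lists of equal shape; this layout is an illustrative assumption.
    """
    same_height = len(onset) == len(apex)
    same_width = all(len(r0) == len(r1) for r0, r1 in zip(onset, apex))
    if not (same_height and same_width):
        raise ValueError("onset and apex frames must have the same shape")
    return [
        [a - o for o, a in zip(row_onset, row_apex)]
        for row_onset, row_apex in zip(onset, apex)
    ]


# Hypothetical 3x3 frames: only two pixels change between onset and apex.
onset = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
apex = [[10, 12, 10], [10, 10, 10], [10, 7, 10]]

diff = frame_difference(onset, apex)
changed = sum(1 for row in diff for value in row if value != 0)
```

In this toy case every static pixel maps to zero and only the two moving pixels survive, which is the sense in which a difference image discards redundant appearance information while keeping the motion between the two frames.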

Source journal

IEEE Transactions on Biometrics, Behavior, and Identity Science

CiteScore

10.90

Self-citation rate

0.00%

Number of published papers