Hard semantic mask strategy for automatic facial action unit recognition with teacher–student model

Impact factor 3.5 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems)
Zichen Liang, Haiying Xia, Yumei Tan, Shuxiang Song
Multimedia Systems, journal article, published 2024-06-20. DOI: 10.1007/s00530-024-01385-x (https://doi.org/10.1007/s00530-024-01385-x)
Citations: 0

Abstract

The Facial Action Coding System (FACS) is widely used in affective computing; it defines a set of facial action units (AUs), each corresponding to a localized region of the face. Fine-grained features from these critical regions are crucial for accurate AU recognition. However, the random masking conventionally used in Masked Image Modeling (MIM) often overlooks the inherent symmetry of faces and the complex interrelationships among facial muscles, which leads to a loss of critical local detail and poor AU recognition performance. To address these limitations, we propose a novel teacher–student MIM framework called Hard Semantic Masking Strategy Teacher–Student (HSMS-TS). Specifically, we first introduce a hard semantic mask strategy in the teacher model, which aims to guide the student network toward learning fine-grained AU-related representations. The student network then uses the attention maps from the pretrained teacher model to generate a more challenging mask from a predefined template, increasing the learning difficulty and helping the student acquire better AU-related representations. Experimental results on two publicly available datasets, BP4D and DISFA, demonstrate the effectiveness of the proposed method. Code will be publicly available at http://github.com/lzichen/HSMS-TS.
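The abstract does not spell out how the teacher's attention maps are turned into a mask, so the following is only an illustrative sketch of the general idea of attention-guided masking, not the HSMS-TS procedure itself. It assumes a flat vector of per-patch attention scores and masks the highest-scoring patches, contrasted with uniform random masking. The function names, the toy scores, and the 0.5 mask ratio are all hypothetical, and the predefined template and any symmetry handling mentioned in the abstract are deliberately left out.

```python
import numpy as np

def random_mask(num_patches, mask_ratio, rng):
    """Baseline: mask a uniformly random subset of patches (True = masked)."""
    n_mask = int(round(mask_ratio * num_patches))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=n_mask, replace=False)] = True
    return mask

def attention_guided_mask(attn, mask_ratio):
    """Mask the patches with the highest attention scores (True = masked).

    Assumption for illustration: `attn` holds one non-negative score per
    patch, e.g. taken from a pretrained teacher model.
    """
    attn = np.asarray(attn, dtype=float).ravel()
    n_mask = int(round(mask_ratio * attn.size))
    # Sort patch indices by descending score; stable sort keeps ties deterministic.
    order = np.argsort(-attn, kind="stable")
    mask = np.zeros(attn.size, dtype=bool)
    mask[order[:n_mask]] = True
    return mask

# Toy example with six patches and made-up attention scores.
toy_attn = np.array([0.05, 0.40, 0.10, 0.30, 0.02, 0.13])
hard = attention_guided_mask(toy_attn, mask_ratio=0.5)
easy = random_mask(toy_attn.size, mask_ratio=0.5, rng=np.random.default_rng(0))
```

In this toy setting the attention-guided mask hides exactly the patches the teacher scores highest, which is one way a mask can be made harder than a random one at the same mask ratio.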


Source journal: Multimedia Systems (Engineering & Technology; Computer Science: Theory & Methods)
CiteScore: 5.40
Self-citation rate: 7.70%
Articles published: 148
Review time: 4.5 months
About the journal: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.