{"title":"CmdVIT: A Voluntary Facial Expression Recognition Model for Complex Mental Disorders","authors":"Jiayu Ye;Yanhong Yu;Qingxiang Wang;Guolong Liu;Wentao Li;An Zeng;Yiqun Zhang;Yang Liu;Yunshao Zheng","doi":"10.1109/TIP.2025.3567825","DOIUrl":null,"url":null,"abstract":"Facial Expression Recognition (FER) is a critical method for evaluating the emotional states of patients with mental disorders, playing a significant role in treatment monitoring. However, due to privacy constraints, facial expression data from patients with mental disorders is severely limited. Additionally, the more complex inter-class and intra-class similarities compared to healthy individuals make accurate recognition of facial expressions challenging. Therefore, we propose a Voluntary Facial Expression Mimicry (VFEM) experiment, which collected facial expression data from schizophrenia, depression, and anxiety. This experiment establishes the first dataset designed for facial expression recognition tasks exclusively composed of patients with mental disorders. Simultaneously, based on VFEM, we propose a Vision Transformer FER model tailored for Complex mental disorder patients (CmdVIT). CmdVIT integrates crucial facial expression features through both explicit and implicit mechanisms, including explicit visual center positional encoding and implicit sparse attention center loss function. These two key components enhance positional information and minimize the facial feature space distance between conventional attention and critical attention, effectively suppressing inter-class and intra-class similarities. In various FER tasks for different mental disorders in VFEM, CmdVIT achieves more competitive performance compared to contemporary benchmark models. Our works are available at <uri>https://github.com/yjy-97/CmdVIT</uri>.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"3013-3024"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11003429/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Facial Expression Recognition (FER) is a critical method for evaluating the emotional states of patients with mental disorders and plays a significant role in treatment monitoring. However, due to privacy constraints, facial expression data from patients with mental disorders are severely limited. Additionally, the inter-class and intra-class similarities of these patients' expressions are more pronounced than in healthy individuals, which makes accurate recognition challenging. Therefore, we propose a Voluntary Facial Expression Mimicry (VFEM) experiment, which collected facial expression data from patients with schizophrenia, depression, and anxiety. This experiment establishes the first facial expression recognition dataset composed exclusively of patients with mental disorders. Based on VFEM, we further propose a Vision Transformer FER model tailored for patients with complex mental disorders (CmdVIT). CmdVIT integrates crucial facial expression features through both explicit and implicit mechanisms: an explicit visual center positional encoding and an implicit sparse attention center loss. These two components enhance positional information and minimize the facial feature-space distance between conventional attention and critical attention, effectively suppressing inter-class and intra-class similarities. Across the FER tasks for the different mental disorders in VFEM, CmdVIT achieves more competitive performance than contemporary benchmark models. Our work is available at https://github.com/yjy-97/CmdVIT.
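To make the "sparse attention center loss" idea concrete, the sketch below shows one plausible reading of it in PyTorch: the most highly attended patch tokens are pooled into a "critical" feature that is pulled toward a learnable per-class center, shrinking intra-class spread in the feature space. The class name `AttentionCenterLoss`, its arguments, and the top-k sparsification scheme are illustrative assumptions, not the authors' released implementation (see the GitHub repository for that).

```python
import torch
import torch.nn as nn


class AttentionCenterLoss(nn.Module):
    """Illustrative center loss over sparsely attention-pooled ViT features.

    Keeps one learnable center per expression class and pulls each sample's
    attention-weighted "critical" feature toward its class center. This is a
    sketch of the general center-loss-on-attention idea, not CmdVIT's exact loss.
    """

    def __init__(self, num_classes: int, feat_dim: int, sparsity: float = 0.1):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.sparsity = sparsity  # fraction of tokens treated as "critical"

    def forward(self, tokens: torch.Tensor, attn: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, D) patch features; attn: (B, N) attention scores; labels: (B,)
        k = max(1, int(self.sparsity * attn.size(1)))
        top_attn, top_idx = attn.topk(k, dim=1)                  # keep only the most attended tokens
        weights = torch.softmax(top_attn, dim=1).unsqueeze(-1)   # (B, k, 1) renormalized weights
        top_tokens = torch.gather(
            tokens, 1, top_idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        )                                                        # (B, k, D) selected patch features
        critical_feat = (weights * top_tokens).sum(dim=1)        # (B, D) sparse-attention pooled feature
        # squared distance to the center of each sample's ground-truth class
        return ((critical_feat - self.centers[labels]) ** 2).sum(dim=1).mean()


# Minimal usage example with random tensors (batch of 8, 196 patches, 768-dim features, 7 classes).
if __name__ == "__main__":
    loss_fn = AttentionCenterLoss(num_classes=7, feat_dim=768)
    tokens = torch.randn(8, 196, 768)
    attn = torch.rand(8, 196)
    labels = torch.randint(0, 7, (8,))
    print(loss_fn(tokens, attn, labels))
```

In practice a term like this would be added, with a small weight, to the standard cross-entropy classification loss, so that the model is simultaneously encouraged to classify correctly and to compact each class's critical-attention features, which is consistent with the abstract's goal of suppressing intra-class similarity effects.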