Triple-Attribute Perceptron Facial Expression Recognition in Real-World Environments

IF 4.3; Zone 2, Computer Science; JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Consumer Electronics; Pub Date: 2024-12-18; DOI: 10.1109/TCE.2024.3519514

Wei-Yen Hsu; Ting-Hsuan Chiang

Vol. 71, No. 1, pp. 608-620
https://ieeexplore.ieee.org/document/10806792/

Citations: 0

Abstract

Facial expression recognition (FER) has become a prominent research area due to its wide range of applications, such as human-robot interaction (HRI), driver state monitoring, and medical diagnosis. However, real-world environments pose significant challenges to FER, including occlusion, variations in lighting, and different angles. In this study, a novel triple-attribute perceptron network (TAPNet) is proposed to tackle the limited effectiveness of FER in real-world environments. TAPNet improves FER performance in real-world environments by effectively leveraging triple-attribute facial features from global, local, and critical subregions, thereby fully exploiting the diverse potential information provided by each facial attribute, similar to the human face perception mechanism that extracts both global and regional information. Specifically, the global facial perception (GFP) module emphasizes the most important facial features in the overall face by increasing the number of channels to preserve features and assigning weights to different channels. Additionally, the local facial perception (LFP) and critical facial perception (CFP) modules capture regional feature information from local and critical facial features, respectively, focusing on fine-grained regional features and minimizing interference from irrelevant regions during feature extraction. The experimental results indicate that the proposed TAPNet model achieves an accuracy of 90.77% on the RAF-DB dataset and 65.13% on the AffectNet-7 dataset. Moreover, this model also demonstrates promising FER performance compared to the state-of-the-art approaches on several real-world datasets.
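
The abstract describes the GFP module only at a high level: it increases the number of channels and assigns weights to different channels. The sketch below illustrates that general idea in plain Python. It is not TAPNet's actual architecture, which the abstract does not specify. The function names (`expand_channels`, `channel_weights`, `reweight`), the fixed mixing matrix, and the sigmoid-of-mean gate are all assumptions made for illustration; a real channel-attention block would typically learn these weights.

```python
import math

def expand_channels(fmap, mix):
    """1x1 convolution: every output channel is a weighted sum of the input channels.

    fmap: list of C_in channels, each an H x W list of floats.
    mix:  list of C_out rows, each holding C_in mixing coefficients (hypothetical values).
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(coeffs[c] * fmap[c][i][j] for c in range(len(fmap))) for j in range(w)]
         for i in range(h)]
        for coeffs in mix
    ]

def channel_weights(fmap):
    """Global-average-pool each channel, then squash the mean into (0, 1) with a sigmoid."""
    weights = []
    for ch in fmap:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weights.append(1.0 / (1.0 + math.exp(-mean)))
    return weights

def reweight(fmap, weights):
    """Scale every value in a channel by that channel's weight."""
    return [[[v * wt for v in row] for row in ch] for ch, wt in zip(fmap, weights)]

# Toy input: 2 channels of size 2 x 2, expanded to 4 channels.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, -1.0], [-2.0, -3.0]],
]
mix = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, -1.0]]

expanded = expand_channels(fmap, mix)
weights = channel_weights(expanded)
out = reweight(expanded, weights)
```

In this toy setup, channels with a larger average response receive a larger weight, so they contribute more to whatever layer consumes `out`. A learned gate would replace the fixed sigmoid-of-mean rule used here.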

Source journal

IEEE Transactions on Consumer Electronics (Engineering & Technology: Telecommunications)

CiteScore

7.70

Self-citation rate

9.30%

Articles published

Review time

3.3 months

About the journal: The main focus for the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture or end use of mass market electronics, systems, software and services for consumers.