Authors: Xiaoyu Wen, Juxiang Zhou, Jianhou Gan, Sen Luo
DOI: 10.1088/1361-6501/ad191c (https://doi.org/10.1088/1361-6501/ad191c)
Journal: Measurement Science and Technology, Vol. 42, No. 11 (Journal Article)
Impact factor: 3.4; JCR Q1, Engineering, Multidisciplinary
Publication date: 2024-01-04
A discriminative multiscale feature extraction network for facial expression recognition in the wild
Driven by advances in deep learning, the field of facial expression recognition has made substantial progress over the past decade, yet challenges posed by occlusions, pose variations and subtle expression differences in unconstrained ("in-the-wild") scenarios remain. This paper therefore proposes a novel multiscale feature extraction method that leverages convolutional neural networks to simultaneously extract deep semantic features and shallow geometric features. Through a channel-wise self-attention mechanism, prominent features are further extracted and compressed, preserving the features most useful for discrimination and thereby reducing the impact of occlusions and pose variations on expression recognition. In addition, inspired by the large-cosine-margin concept used in face recognition, a center cosine loss function is proposed to avoid misclassifications caused by the inherent inter-class similarity and substantial intra-class feature variation in expression recognition. This loss improves the classification performance of the network by making the distribution of samples within the same class more compact and that between different classes sparser. The proposed method is benchmarked against several strong baseline models on three mainstream in-the-wild datasets and two datasets that present realistic occlusion and pose-variation challenges. Accuracies of 89.63%, 61.82%, and 91.15% are achieved on RAF-DB, AffectNet and FERPlus, respectively, demonstrating greater real-world robustness and reliability than state-of-the-art alternatives.
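The abstract does not give the exact form of the channel-wise self-attention module, so the following is only a minimal, dependency-free sketch of the general idea it describes: squeeze each channel of a feature map to a scalar summary, turn the summaries into attention weights, and rescale the channels so that the salient ones dominate. All names and the choice of mean-pooling plus softmax are assumptions for illustration, not the paper's actual module.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def channel_attention(feature_map):
    """Hypothetical channel-wise attention sketch.

    feature_map: list of channels, each a flat list of activations.
    Each channel is squeezed to its mean activation, the means are
    softmax-normalized into per-channel weights, and every channel is
    rescaled by its weight, emphasizing informative channels.
    """
    means = [sum(ch) / len(ch) for ch in feature_map]
    weights = softmax(means)
    return [[w * v for v in ch] for w, ch in zip(weights, feature_map)]
```

For example, given two channels with mean activations 1.0 and 3.0, the second channel receives the larger weight and dominates the rescaled output, which is the "preserving the features most useful for discrimination" effect the abstract describes.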
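The center cosine loss itself is not specified in the source, but its stated ingredients (a center-based view of each class plus a large cosine margin, as in margin-based face recognition losses) can be sketched as follows. Every function name, the additive-margin formulation, and the `margin`/`scale` values are assumptions for illustration; the paper's actual loss may differ.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def center_cosine_loss(features, labels, centers, margin=0.35, scale=30.0):
    """Illustrative center-cosine-style loss (an assumption, not the
    paper's exact formulation).

    Each class is represented by a center vector. A sample's logits are
    scaled cosine similarities to all centers, with an additive margin
    subtracted from its own class's cosine. Softmax cross-entropy over
    these logits then pulls samples toward their class center (intra-class
    compactness) and pushes centers apart (inter-class separation).
    """
    total = 0.0
    for x, y in zip(features, labels):
        logits = [scale * cosine(x, c) for c in centers]
        logits[y] = scale * (cosine(x, centers[y]) - margin)
        # Stable log-sum-exp for the cross-entropy denominator.
        m = max(logits)
        lse = m + math.log(sum(math.exp(l - m) for l in logits))
        total += lse - logits[y]
    return total / len(features)
```

A quick sanity check of the intended behavior: a sample lying near its own class center incurs a near-zero loss, while the same sample labeled with the other class incurs a large one, matching the abstract's goal of compact intra-class and sparse inter-class distributions.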
Journal overview:
Measurement Science and Technology publishes articles on new measurement techniques and associated instrumentation. Papers that describe experiments must represent an advance in measurement science or measurement technique rather than the application of established experimental technique. Bearing in mind the multidisciplinary nature of the journal, authors must provide an introduction to their work that makes clear the novelty, significance, and broader relevance of their work in a measurement context, and its relevance to the readership of Measurement Science and Technology. All submitted articles should contain consideration of the uncertainty, precision and/or accuracy of the measurements presented.
Subject coverage includes the theory, practice and application of measurement in physics, chemistry, engineering and the environmental and life sciences from inception to commercial exploitation. Publications in the journal should emphasize the novelty of reported methods, characterize them and demonstrate their performance using examples or applications.