Authors: Xiaoyu Wen, Juxiang Zhou, Jianhou Gan, Sen Luo
DOI: 10.1088/1361-6501/ad191c (https://doi.org/10.1088/1361-6501/ad191c)
Journal: Measurement Science and Technology, Vol. 42, No. 11 (Journal Article)
Impact factor: 3.4; JCR Q1, Engineering, Multidisciplinary
Publication date: 2024-01-04
A discriminative multiscale feature extraction network for facial expression recognition in the wild
Driven by advances in deep learning, the field of facial expression recognition has made substantial progress over the past decade, yet challenges posed by occlusions, pose variations and subtle expression differences in unconstrained ("in-the-wild") scenarios remain. This paper therefore proposes a novel multiscale feature extraction method that leverages convolutional neural networks to simultaneously extract deep semantic features and shallow geometric features. Through a channel-wise self-attention mechanism, prominent features are further extracted and compressed, preserving the features most useful for discrimination and thereby reducing the impact of occlusions and pose variations on expression recognition. In addition, inspired by the large-cosine-margin concept used in face recognition, a center cosine loss function is proposed to avoid misclassifications caused by the inherent inter-class similarity and substantial intra-class feature variation in expression recognition. This loss improves the classification performance of the network by making the distribution of samples within the same class more compact and that between different classes sparser. The proposed method is benchmarked against several strong baseline models on three mainstream in-the-wild datasets and two datasets that present realistic occlusion and pose-variation challenges. Accuracies of 89.63%, 61.82%, and 91.15% are achieved on RAF-DB, AffectNet and FERPlus, respectively, demonstrating greater real-world robustness and reliability than state-of-the-art alternatives.
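The abstract does not give the exact form of the channel-wise self-attention module, so the following is only a minimal, dependency-free sketch of the general idea it describes: squeeze each channel of a feature map to a scalar summary, turn the summaries into attention weights, and rescale the channels so that the salient ones dominate. All names and the choice of mean-pooling plus softmax are assumptions for illustration, not the paper's actual module.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def channel_attention(feature_map):
    """Hypothetical channel-wise attention sketch.

    feature_map: list of channels, each a flat list of activations.
    Each channel is squeezed to its mean activation, the means are
    softmax-normalized into per-channel weights, and every channel is
    rescaled by its weight, emphasizing informative channels.
    """
    means = [sum(ch) / len(ch) for ch in feature_map]
    weights = softmax(means)
    return [[w * v for v in ch] for w, ch in zip(weights, feature_map)]
```

For example, given two channels with mean activations 1.0 and 3.0, the second channel receives the larger weight and dominates the rescaled output, which is the "preserving the features most useful for discrimination" effect the abstract describes.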
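The center cosine loss itself is not specified in the source, but its stated ingredients (a center-based view of each class plus a large cosine margin, as in margin-based face recognition losses) can be sketched as follows. Every function name, the additive-margin formulation, and the `margin`/`scale` values are assumptions for illustration; the paper's actual loss may differ.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def center_cosine_loss(features, labels, centers, margin=0.35, scale=30.0):
    """Illustrative center-cosine-style loss (an assumption, not the
    paper's exact formulation).

    Each class is represented by a center vector. A sample's logits are
    scaled cosine similarities to all centers, with an additive margin
    subtracted from its own class's cosine. Softmax cross-entropy over
    these logits then pulls samples toward their class center (intra-class
    compactness) and pushes centers apart (inter-class separation).
    """
    total = 0.0
    for x, y in zip(features, labels):
        logits = [scale * cosine(x, c) for c in centers]
        logits[y] = scale * (cosine(x, centers[y]) - margin)
        # Stable log-sum-exp for the cross-entropy denominator.
        m = max(logits)
        lse = m + math.log(sum(math.exp(l - m) for l in logits))
        total += lse - logits[y]
    return total / len(features)
```

A quick sanity check of the intended behavior: a sample lying near its own class center incurs a near-zero loss, while the same sample labeled with the other class incurs a large one, matching the abstract's goal of compact intra-class and sparse inter-class distributions.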
Journal overview:
Measurement Science and Technology publishes articles on new measurement techniques and associated instrumentation. Papers that describe experiments must represent an advance in measurement science or measurement technique rather than the application of established experimental technique. Bearing in mind the multidisciplinary nature of the journal, authors must provide an introduction to their work that makes clear the novelty, significance, and broader relevance of their work in a measurement context, and its relevance to the readership of Measurement Science and Technology. All submitted articles should contain consideration of the uncertainty, precision and/or accuracy of the measurements presented.
Subject coverage includes the theory, practice and application of measurement in physics, chemistry, engineering and the environmental and life sciences from inception to commercial exploitation. Publications in the journal should emphasize the novelty of reported methods, characterize them and demonstrate their performance using examples or applications.