Diving Into Sample Selection for Facial Expression Recognition With Noisy Annotations
Wei Nie; Zhiyong Wang; Xinming Wang; Bowen Chen; Hanlin Zhang; Honghai Liu
IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 7, no. 1, pp. 95-107, 2024.
DOI: 10.1109/TBIOM.2024.3435498 | https://ieeexplore.ieee.org/document/10614222/
Abstract
Real-world Facial Expression Recognition (FER) suffers from noisy labels caused by ambiguous expressions and subjective annotation. Addressing noisy-label FER involves two core issues: the efficient utilization of clean samples and the effective utilization of noisy samples. However, existing methods demonstrate their effectiveness solely through the generalization gains obtained by training on all corrupted data, making it difficult to ascertain whether the observed improvement genuinely addresses either issue. To disentangle this dilemma, this paper focuses on efficiently utilizing clean samples by diving into sample selection. Specifically, we enhance the classical noisy-label learning method Co-divide with two straightforward modifications, yielding a noisy-label discriminator better suited to FER, termed IntraClass-divide. First, IntraClass-divide fits a separate two-component Gaussian Mixture Model (GMM) for each category instead of a single GMM shared across all categories. Second, IntraClass-divide simplifies the framework by eliminating the dual-network training scheme. Under the standard synthetic noise paradigm, IntraClass-divide achieves leading sample selection performance of nearly 95% Micro-F1; we further propose, for the first time, a natural noise paradigm, on which it also achieves a leading 82.63% Micro-F1. Moreover, training a ResNet18 on the clean samples identified by IntraClass-divide yields better generalization than previous sophisticated noisy-label FER models trained on all corrupted data.
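The per-class selection step described above can be illustrated with a minimal sketch, assuming per-sample cross-entropy losses from a warm-up model are already available; the function and parameter names below are illustrative and not taken from the paper's code.

```python
# Minimal sketch of per-class loss modeling for clean-sample selection,
# in the spirit of IntraClass-divide as described in the abstract.
# Assumption (not from the paper's code): `losses` holds per-sample
# cross-entropy values from a single warm-up network.
import numpy as np
from sklearn.mixture import GaussianMixture


def select_clean_samples(losses, labels, num_classes, threshold=0.5):
    """Fit a two-component GMM per class on normalized losses and keep
    samples whose posterior for the low-loss (clean) component exceeds
    `threshold`."""
    losses = np.asarray(losses, dtype=np.float64)
    labels = np.asarray(labels)
    clean_mask = np.zeros(len(losses), dtype=bool)

    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        if len(idx) < 2:
            clean_mask[idx] = True  # too few samples to model; keep them
            continue

        # normalize losses within the class so components are comparable
        l = losses[idx]
        l = (l - l.min()) / (l.max() - l.min() + 1e-8)

        gmm = GaussianMixture(n_components=2, reg_covar=5e-4, max_iter=100)
        gmm.fit(l.reshape(-1, 1))

        # posterior of the component with the smaller mean = "clean"
        clean_component = int(np.argmin(gmm.means_.ravel()))
        prob_clean = gmm.predict_proba(l.reshape(-1, 1))[:, clean_component]
        clean_mask[idx] = prob_clean > threshold

    return clean_mask
```

The selected subset can then be used to train a single classifier (e.g., a ResNet18), with no dual-network co-training, which reflects the simplification over Co-divide that the abstract describes.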