Title: Video Face Recognition Using Neural Aggregation Networks with Mutual Relational Learning
Authors: Kangli Zeng, Zhongyuan Wang, Tao Lu, Jianyu Chen
Venue: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI)
Publication date: 2022-10-01
DOI: https://doi.org/10.1109/ICTAI56018.2022.00104
Citation count: 0
Abstract
Video face recognition benefits profoundly from deep convolutional neural networks (CNNs), which learn robust feature embeddings. However, because of their fixed geometric structures, CNNs are inherently limited in modeling the large variations in face images caused by viewing angle, pose, occlusion, and other factors. In this paper, a neural aggregation network based on mutual relational learning is proposed for video face recognition. First, an Intra-frame Relational Learning Network (Intra-Net) is introduced, which models the interdependencies between the regional components of individual features and establishes relevance between fine-grained features. This processing adaptively determines the region of interest according to the quality of the input face image, so that valuable information can be extracted. Second, we introduce an Inter-frame Relational Learning Network (Inter-Net), which considers the most significant appearance representation in the overall structure of the face image to correlate complementary features between frames. Finally, information aggregation is performed by combining Inter-Net and Intra-Net. Joint optimization of the two branches allows our model to effectively exploit the complementary information between them and improve its aggregation capability. We validate the effectiveness of our model for video face recognition, showing that it outperforms state-of-the-art methods on two benchmark datasets.
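The abstract gives no equations or layer definitions, so the following is only a rough sketch of the general idea, not the authors' architecture. It assumes each frame is already encoded as a list of regional feature vectors, uses parameter-free scaled dot-product self-attention as a stand-in for both relational branches, and combines them by letting the inter-frame branch weight the per-frame features produced by the intra-frame branch. All function names (`self_attention`, `mean_pool`, `aggregate_video`) and the combination rule are hypothetical; a trained model would use learned projections and a joint loss.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def self_attention(vectors):
    """Parameter-free scaled dot-product self-attention (an assumption):
    each output is a softmax-weighted mix of all input vectors."""
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in vectors:
        weights = softmax([dot(query, key) / scale for key in vectors])
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs


def aggregate_video(frames):
    """frames: list of frames, each a list of regional feature vectors.
    Returns (video_feature, frame_weights)."""
    # Intra-frame branch (stand-in): relate the regions inside each frame,
    # then pool them into one vector per frame.
    intra = [mean_pool(self_attention(regions)) for regions in frames]

    # Inter-frame branch (stand-in): relate holistic frame vectors to each
    # other and score each frame against the mean of the related vectors.
    holistic = [mean_pool(regions) for regions in frames]
    inter = self_attention(holistic)
    centre = mean_pool(inter)
    frame_weights = softmax([dot(f, centre) for f in inter])

    # Combination (assumed): inter-frame weights pool the intra-frame features.
    dim = len(intra[0])
    video = [sum(w * f[i] for w, f in zip(frame_weights, intra)) for i in range(dim)]
    return video, frame_weights
```

Calling `aggregate_video(frames)` on a list of three frames with two 4-dimensional regions each returns a 4-dimensional video feature and three frame weights that sum to one. Frames with identical content receive equal weights, which is the behaviour one would expect from any quality-aware pooling scheme before training.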