Video Face Recognition Using Neural Aggregation Networks with Mutual Relational Learning

2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI). Pub Date: 2022-10-01. DOI: 10.1109/ICTAI56018.2022.00104

Kangli Zeng, Zhongyuan Wang, Tao Lu, Jianyu Chen



Abstract

Video face recognition benefits profoundly from deep convolutional neural networks (CNNs), which learn robust feature embeddings. However, because of their fixed geometric structures, CNNs are inherently limited in modeling the large variations in face images caused by viewing angle, pose, occlusion, and other factors. In this paper, a neural aggregation network based on mutual relational learning is proposed for video face recognition. First, an Intra-frame Relational Learning Network (Intra-Net) is introduced, which models the interdependencies between the regional components of individual features and establishes relevance between fine-grained features. This processing adaptively determines the region of interest according to the quality of the input face image, so that valuable information can be extracted. Second, we introduce an Inter-frame Relational Learning Network (Inter-Net), which considers the most significant appearance representation in the overall structure of the face image in order to correlate the complementary features between frames. Finally, information aggregation is performed by combining Inter-Net and Intra-Net. Joint optimization of the two branches allows our model to effectively exploit the complementary information between them and thereby improve its aggregation capability. We validate the effectiveness of our model for video face recognition, demonstrating its superiority over state-of-the-art methods on two benchmark datasets.
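
The abstract gives no equations or layer definitions, so the following is only an illustrative sketch of the general idea of two-level relational aggregation, not the authors' Intra-Net or Inter-Net. It assumes that each frame is already represented by a set of regional feature vectors, uses plain scaled dot-product attention among regions within a frame, and then weights frames by their similarity to a query vector. The function names, the tensor layout `(T, R, D)`, and the `query` vector are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def intra_frame_relation(regions):
    """Relate the regions inside each frame (assumed stand-in for an intra-frame branch).

    regions: array of shape (T, R, D) = frames x regions x feature dim.
    Returns one feature vector per frame, shape (T, D).
    """
    d = regions.shape[-1]
    # Pairwise region-to-region similarity within every frame: (T, R, R).
    scores = regions @ regions.transpose(0, 2, 1) / np.sqrt(d)
    attn = softmax(scores, axis=-1)
    refined = attn @ regions          # each region re-expressed via related regions
    return refined.mean(axis=1)       # pool regions into a frame-level feature


def inter_frame_aggregate(frames, query):
    """Weight frames against a query vector (assumed stand-in for an inter-frame branch).

    frames: (T, D); query: (D,). Returns the pooled (D,) feature and the (T,) weights.
    """
    scores = frames @ query / np.sqrt(frames.shape[-1])
    weights = softmax(scores, axis=0)  # one non-negative weight per frame, summing to 1
    return weights @ frames, weights


def aggregate_video(regions, query):
    """Chain both steps and L2-normalize the resulting video-level embedding."""
    frame_feats = intra_frame_relation(regions)
    video_feat, weights = inter_frame_aggregate(frame_feats, query)
    return video_feat / np.linalg.norm(video_feat), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regions = rng.normal(size=(5, 4, 8))   # 5 frames, 4 regions each, 8-dim features
    query = rng.normal(size=8)             # would be learned in a real model
    embedding, weights = aggregate_video(regions, query)
    print(embedding.shape, weights.shape)
```

In a trained system the query and any projection matrices would be learned jointly with the recognition loss; here they are random or omitted so that the sketch stays self-contained.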
