{"title":"Distillation-Guided Representation Learning for Unconstrained Video Human Authentication","authors":"Yuxiang Guo;Siyuan Huang;Ram Prabhakar Kathirvel;Chun Pong Lau;Rama Chellappa;Cheng Peng","doi":"10.1109/TBIOM.2025.3595366","DOIUrl":null,"url":null,"abstract":"Human authentication is an important and challenging biometric task, particularly from unconstrained videos. While body recognition is a popular approach, gait recognition holds the promise of robustly identifying subjects based on walking patterns instead of appearance information. Previous gait-based approaches have performed well for curated indoor scenes; however, they tend to underperform in unconstrained situations. To address these challenges, we propose a framework, termed Holistic GAit DEtection and Recognition (H-GADER), for human authentication in challenging outdoor scenarios. Specifically, H-GADER leverages a Double Helical Signature to detect segments that contain human movement and builds discriminative features through a novel gait recognition method. To further enhance robustness, H-GADER encodes viewpoint information in its architecture, and distills learned representations from an auxiliary RGB recognition model; this allows H-GADER to learn from maximum amount of data at training time. At test time, H-GADER infers solely from the silhouette modality. Furthermore, we introduce a body recognition model through semantic, large-scale, self-supervised training to complement gait recognition. By conditionally fusing gait and body representations based on the presence/absence of gait information as decided by the gait detection, we demonstrate significant improvements compared to when a single modality or a naive feature ensemble is used. We evaluate our method on multiple existing State-of-The-Arts (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets, especially on the BRIAR dataset, which features unconstrained, long-distance videos, achieving a 28.9% improvement.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 4","pages":"940-952"},"PeriodicalIF":5.0000,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11111687","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biometrics, behavior, and identity science","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11111687/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Human authentication is an important and challenging biometric task, particularly from unconstrained videos. While body recognition is a popular approach, gait recognition holds the promise of robustly identifying subjects based on walking patterns rather than appearance information. Previous gait-based approaches have performed well on curated indoor scenes; however, they tend to underperform in unconstrained situations. To address these challenges, we propose a framework, termed Holistic GAit DEtection and Recognition (H-GADER), for human authentication in challenging outdoor scenarios. Specifically, H-GADER leverages a Double Helical Signature to detect segments that contain human movement and builds discriminative features through a novel gait recognition method. To further enhance robustness, H-GADER encodes viewpoint information in its architecture and distills learned representations from an auxiliary RGB recognition model; this allows H-GADER to learn from the maximum amount of data available at training time. At test time, H-GADER infers solely from the silhouette modality. Furthermore, we introduce a body recognition model trained through semantic, large-scale, self-supervised learning to complement gait recognition. By conditionally fusing gait and body representations based on the presence or absence of gait information, as determined by the gait detection module, we demonstrate significant improvements over using a single modality or a naive feature ensemble. We evaluate our method on multiple existing state-of-the-art (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets, especially on the BRIAR dataset, which features unconstrained, long-distance videos, achieving a 28.9% improvement.
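To make the two core mechanisms in the abstract concrete, below is a minimal PyTorch sketch of (1) distilling an auxiliary RGB recognition model's representation into a silhouette-based gait encoder at training time, and (2) conditionally fusing gait and body embeddings depending on whether the gait detector found usable walking motion. All module names, shapes, and the cosine distillation objective are illustrative assumptions; the paper's actual H-GADER architecture and losses are not specified at this level of detail in the abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SilhouetteGaitEncoder(nn.Module):
    """Placeholder silhouette encoder standing in for the gait branch."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, silhouettes: torch.Tensor) -> torch.Tensor:
        # silhouettes: (B, 1, H, W) binary masks; returns (B, feat_dim)
        return self.encoder(silhouettes)

def distillation_loss(gait_feat: torch.Tensor, rgb_feat: torch.Tensor) -> torch.Tensor:
    """Pull the silhouette embedding toward the frozen RGB teacher's
    embedding. A cosine objective is one common choice for feature
    distillation (an assumption here, not taken from the paper)."""
    return 1.0 - F.cosine_similarity(gait_feat, rgb_feat.detach(), dim=-1).mean()

def conditional_fusion(gait_feat: torch.Tensor,
                       body_feat: torch.Tensor,
                       gait_present: torch.Tensor) -> torch.Tensor:
    """Fuse gait and body embeddings only when the gait detector flags
    usable walking motion; otherwise fall back to the body embedding.
    gait_present: (B,) boolean mask from the gait detection stage."""
    fused = F.normalize(gait_feat + body_feat, dim=-1)  # naive additive fusion
    body_only = F.normalize(body_feat, dim=-1)
    return torch.where(gait_present.unsqueeze(-1), fused, body_only)

# Usage sketch: at training time the distillation loss supplements the
# recognition loss; at test time only silhouettes (and the body branch)
# are used, matching the abstract's silhouette-only inference claim.
encoder = SilhouetteGaitEncoder()
sil = torch.rand(4, 1, 64, 44)            # toy silhouette batch
gait_feat = encoder(sil)
rgb_feat = torch.rand(4, 256)             # stand-in for the RGB teacher output
body_feat = torch.rand(4, 256)            # stand-in for the body-recognition branch
loss = distillation_loss(gait_feat, rgb_feat)
embedding = conditional_fusion(gait_feat, body_feat, torch.tensor([True, True, False, True]))

The conditional branch reflects the abstract's key design point: when the Double Helical Signature stage finds no walking segment, the gait feature is unreliable, so the system should not blend it in; a naive feature ensemble would use both features unconditionally.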