SPD-Net: A semantic partitioned transformer with dynamic graph network for improved skeleton-based gait recognition

IF 6.3; Zone 1 (Computer Science); JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks; Pub Date: 2026-07-01; Epub Date: 2026-02-03; DOI: 10.1016/j.neunet.2026.108679

Priyanka D, Mala T

Neural Networks, volume 199, Article 108679. Journal article, not open access. https://www.sciencedirect.com/science/article/pii/S0893608026001413

Citations: 0

Abstract

Gait recognition has gained prominence as a biometric modality owing to its unobtrusive and non-invasive nature. Existing methods primarily rely on silhouette-based representations, making them sensitive to variations in clothing, occlusion, and background noise. In contrast, model-based approaches utilize skeleton sequences to capture motion dynamics through joint connectivity, thereby reducing dependence on visual appearance. However, these approaches often rely on physically connected joints, limiting their ability to model semantically meaningful joint relationships. Transformer-based models mitigate this limitation by capturing long-range dependencies, but at the expense of substantial computational overhead. To address these challenges, this work proposes the Semantic Partitioned transformer with Dynamic Graph Network (SPD-Net) for robust gait recognition. SPD-Net integrates Dynamic Graph Convolutional Network (DGCN), Temporal Convolutional Network (TCN), and Semantic Partitioned Multi-head Self-Attention (SP-MSA) to enhance the representation of gait features. DGCN dynamically learns spatial correlations between joints, while TCN captures temporal dependencies. Furthermore, SP-MSA introduces a semantic partitioning strategy that selectively focuses on key joints and frames, significantly reducing computational complexity while preserving crucial gait patterns. This approach effectively models both physically neighboring and distant joint relationships, along with intra- and inter-frame correlations. Finally, a Joint-Part Mapping (JPM) module enhances the discriminative power of gait representations by capturing hierarchical joint relationships across multiple scales. Experimental evaluations on benchmark gait datasets show that SPD-Net surpasses prior state-of-the-art approaches, achieving improved robustness and accuracy across diverse gait recognition challenges.
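
The abstract does not say how SP-MSA partitions tokens, so the following is only a rough illustration of why confining self-attention to partitions lowers cost; it is not the authors' method. It counts query-key score pairs for full attention over all joint-frame tokens and for attention restricted to groups. The 17-joint skeleton, the 30-frame clip, the five part sizes, and the function names are all hypothetical values chosen for the example.

```python
# Illustrative only: the token counts and partition sizes are assumptions,
# not values taken from the SPD-Net paper.


def attention_pairs(num_tokens: int) -> int:
    """Query-key score pairs for full self-attention over num_tokens tokens."""
    return num_tokens * num_tokens


def partitioned_attention_pairs(partition_sizes: list[int]) -> int:
    """Score pairs when attention is computed only inside each partition."""
    return sum(attention_pairs(size) for size in partition_sizes)


NUM_JOINTS = 17                 # hypothetical skeleton layout
NUM_FRAMES = 30                 # hypothetical clip length
PART_JOINTS = [5, 3, 3, 3, 3]   # hypothetical joints per body part (sums to 17)

# Each partition holds one body part's joints across every frame.
partition_sizes = [joints * NUM_FRAMES for joints in PART_JOINTS]

full_cost = attention_pairs(NUM_JOINTS * NUM_FRAMES)
part_cost = partitioned_attention_pairs(partition_sizes)

print(f"full attention pairs:        {full_cost}")
print(f"partitioned attention pairs: {part_cost}")
print(f"reduction factor:            {full_cost / part_cost:.2f}")
```

With equal-sized partitions the pair count drops by roughly the number of partitions. The trade-off is that tokens in different partitions no longer attend to each other directly unless another mechanism links them.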


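Source journal

Neural Networks (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days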

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.