Title: STG-KNet: A Kernel-mapping-based spatial-temporal graph convolution network for pedestrian trajectory prediction
Authors: Yuanzi Xu, Jiafu Yang, Rongjun Cheng
Journal: Physica A: Statistical Mechanics and its Applications, volume 678, Article 130985
Publication date: 2025-09-15
DOI: 10.1016/j.physa.2025.130985
URL: https://www.sciencedirect.com/science/article/pii/S0378437125006375
Impact factor: 3.1 (JCR Q2, Physics, Multidisciplinary)
Citations: 0
Abstract
Predicting pedestrian trajectories in complex, dynamic, and crowded environments remains a critical challenge for autonomous driving and human-robot interaction. A pervasive limitation of existing methods is their dependence on rigid graph architectures, which hinders their capacity to model the evolving patterns of pedestrian interaction and obscures the potential features of agent-to-agent relationships. In addition, most models mix spatial and temporal modeling, with no structural decoupling between the two. These issues result in fragmented reasoning and degraded performance in dense pedestrian scenarios. To address these challenges, we propose STG-KNet, a unified spatiotemporal learning framework combining sparse graph convolution with kernel-based structure modeling. STG-KNet features a dual-branch spatiotemporal encoder to decouple and independently model spatial interactions and temporal motion patterns, enhanced by biologically inspired masking strategies. It further introduces a novel Graph Convolutional Kernel Mapping (GCKM) module to convert discrete graph structures into continuous Gaussian similarity matrices, enabling adaptive edge learning and interpretable feature propagation. A Temporal Convolutional Network (TCN) decoder predicts parameters of 2D Gaussian distributions for future positions, supporting multimodal sampling. Comprehensive experiments on the ETH-UCY dataset demonstrate that STG-KNet achieves state-of-the-art accuracy (ADE=0.23, FDE=0.45), outperforming existing models while maintaining structural interpretability and high computational efficiency. In particular, the model shows exceptional generalization in dense and heterogeneous scenes, confirming the effectiveness of sparse kernel-enhanced graph reasoning in trajectory prediction.
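To make two of the ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' GCKM module, which is learned and whose details the abstract does not give. The function names `gaussian_similarity` and `ade_fde`, the fixed bandwidth `sigma`, and the toy coordinates are all assumptions made for illustration. The first function turns pairwise pedestrian distances into a continuous Gaussian similarity matrix instead of a 0/1 adjacency. The second computes ADE and FDE under their usual definitions: mean displacement error over all predicted steps, and displacement error at the final step.

```python
import math


def gaussian_similarity(positions, sigma=1.0):
    """Return an N x N matrix with entries exp(-d_ij^2 / (2 * sigma^2)).

    Illustrative only: sigma is a fixed, hand-picked bandwidth here,
    not a learned parameter as a trained model would use.
    """
    n = len(positions)
    sim = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            sq_dist = (xi - xj) ** 2 + (yi - yj) ** 2
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def ade_fde(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points.

    ADE is the mean Euclidean error over all time steps;
    FDE is the Euclidean error at the last time step.
    """
    errors = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(errors) / len(errors), errors[-1]


# Toy scene (hypothetical coordinates): two nearby pedestrians, one far away.
positions = [(0.0, 0.0), (1.0, 0.0), (4.0, 3.0)]
sim = gaussian_similarity(positions, sigma=1.0)

# Toy prediction versus ground truth over three future steps.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
```

In this toy scene the nearby pair gets a similarity of exp(-0.5), while the distant pedestrian's weight is close to zero. This gives a graded edge strength where a hard distance threshold would give only 0 or 1.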
About the journal:
Physica A: Statistical Mechanics and its Applications
Recognized by the European Physical Society
Physica A publishes research in the field of statistical mechanics and its applications.
Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents.
Applications of the techniques of statistical mechanics are widespread, and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic and sociological systems.