STG-KNet: A Kernel-mapping-based spatial-temporal graph convolution network for pedestrian trajectory prediction

Impact factor 3.1; Zone 3, Physics and Astrophysics; JCR Q2, Physics, Multidisciplinary
Yuanzi Xu, Jiafu Yang, Rongjun Cheng
Physica A: Statistical Mechanics and its Applications, vol. 678, Article 130985. Published 2025-09-15.
DOI: 10.1016/j.physa.2025.130985
https://www.sciencedirect.com/science/article/pii/S0378437125006375
Citations: 0

Abstract

Predicting pedestrian trajectories in complex, dynamic, and crowded environments remains a critical challenge for autonomous driving and human-robot interaction. A pervasive limitation of existing methods is their dependence on rigid graph architectures, which hinders their capacity to model the evolving patterns of pedestrian interaction and obscures the potential features of agent-to-agent relationships. In addition, most models entangle spatial and temporal dependency modeling, with no structural decoupling between the two. These issues result in fragmented reasoning and degraded performance in dense pedestrian scenarios. To address these challenges, we propose STG-KNet, a unified spatiotemporal learning framework that combines sparse graph convolution with kernel-based structure modeling. STG-KNet features a dual-branch spatiotemporal encoder that decouples and independently models spatial interactions and temporal motion patterns, enhanced by biologically inspired masking strategies. It further introduces a novel Graph Convolutional Kernel Mapping (GCKM) module that converts discrete graph structures into continuous Gaussian similarity matrices, enabling adaptive edge learning and interpretable feature propagation. A Temporal Convolutional Network (TCN) decoder predicts the parameters of 2D Gaussian distributions for future positions, supporting multimodal sampling. Comprehensive experiments on the ETH-UCY dataset demonstrate that STG-KNet achieves state-of-the-art accuracy (ADE=0.23, FDE=0.45), outperforming existing models while maintaining structural interpretability and high computational efficiency. In particular, the model generalizes well in dense and heterogeneous scenes, confirming the effectiveness of sparse kernel-enhanced graph reasoning in trajectory prediction.
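To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a "Gaussian similarity matrix" can be illustrated with a standard Gaussian (RBF) kernel over pairwise pedestrian distances, and it computes ADE and FDE for a single predicted trajectory as the average and final Euclidean displacement errors. The function names `gaussian_similarity` and `ade_fde`, the bandwidth parameter `sigma`, and the sample coordinates are all hypothetical; the abstract does not specify the GCKM module's internals or the evaluation protocol.

```python
import math


def gaussian_similarity(positions, sigma=1.0):
    """Return a dense Gaussian (RBF) similarity matrix for 2D positions.

    Assumption: this is a generic kernel, used only to illustrate how a
    discrete graph can be replaced by continuous edge weights in (0, 1].
    """
    n = len(positions)
    sim = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            sq_dist = (xi - xj) ** 2 + (yi - yj) ** 2
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def ade_fde(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    ADE is the mean Euclidean error over all time steps;
    FDE is the Euclidean error at the final time step.
    """
    errors = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(errors) / len(errors), errors[-1]


# Hypothetical scene: three pedestrians at one time step.
positions = [(0.0, 0.0), (1.0, 0.0), (4.0, 3.0)]
sim = gaussian_similarity(positions, sigma=1.0)

# Hypothetical three-step prediction and ground truth for one pedestrian.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
```

In this sketch, nearby pedestrians receive weights close to 1 and distant ones weights close to 0, so the matrix is symmetric with ones on the diagonal. A learned module would presumably adapt such weights during training, which this fixed kernel does not do.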
Source journal
CiteScore: 7.20
Self-citation rate: 9.10%
Articles published: 852
Review time: 6.6 months
Journal description: Physica A: Statistical Mechanics and its Applications. Recognized by the European Physical Society. Physica A publishes research in the field of statistical mechanics and its applications. Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents. Applications of the techniques of statistical mechanics are widespread, and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic, and sociological systems.