Explaining spatio-temporal graph convolutional networks with spatio-temporal constraints perturbation for action recognition
Authors: Rui Yu, Yanshan Li, Ting Shi, Weixin Xie
Journal: Information Fusion, Volume 124, Article 103387 (Journal Article)
Publication date: 2025-06-17
DOI: 10.1016/j.inffus.2025.103387
URL: https://www.sciencedirect.com/science/article/pii/S1566253525004609
Impact factor: 14.7 · JCR: Q1 (Computer Science, Artificial Intelligence) · Category: Computer Science, Region 1 · Open access: No
Citations: 0
Abstract
Understanding and explaining spatio-temporal graph convolutional networks (STGCNs) for human skeleton action recognition is crucial to improving the security and trustworthiness of action recognition algorithms. However, the complexity of geometric spatio-temporal features in skeleton-based spatio-temporal graphs, the strong dependence of geometric features on temporal information, and the dynamics of STGCNs challenge existing explanation methods for graph neural networks (GNNs). These methods struggle to explain the geometric spatio-temporal features intertwined in STGCNs. To this end, we take STGCN-based human skeleton action recognition as the research object and propose a spatio-temporal constraints explanation method for STGCNs (STGExplainer). First, we construct a geometric transition model of human motion in spatio-temporal graphs, which uses Rodrigues' rotation formula to describe the relationship among nodes in geometric space over time. This model is adopted to geometrically perturb STGCNs. Then, to evaluate the importance of geometric spatio-temporal features in STGCNs, we design a geometric perturbation module based on spatio-temporal constraints. The module comprises a spatio-temporal constraints-based objective function and an optimization algorithm based on the gradient alternating direction method of multipliers (ADMM). It identifies the geometric spatio-temporal features that are more critical to the model after geometric perturbation of STGCNs under spatio-temporal constraints. The proposed objective function makes the explanation results highly sparse and geometrically consistent with the spatio-temporal structure of the input. Finally, to solve the spatio-temporal importance optimization problem with spatio-temporal constraints, we design an optimization algorithm based on the gradient ADMM. The algorithm decomposes the optimized spatio-temporal and geometric spatial importance distributions, and then gradually generates more accurate geometric spatio-temporal feature explanations. Experimental results on real datasets show that STGExplainer achieves excellent performance.
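For readers unfamiliar with the rotation formula named in the abstract, the following is a minimal sketch of Rodrigues' rotation applied to a single skeleton bone. It is an illustration only, not the paper's geometric transition model: the function names `rodrigues_rotate` and `rotate_bone`, the choice of rotating a child joint about its parent, and the example coordinates are all assumptions made here.

```python
import math


def rodrigues_rotate(v, axis, theta):
    """Rotate 3-vector v about `axis` by theta radians (Rodrigues' formula).

    v_rot = v*cos(theta) + (k x v)*sin(theta) + k*(k . v)*(1 - cos(theta)),
    where k is the unit vector along `axis`.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    k = [a / norm for a in axis]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Cross product k x v.
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    # Dot product k . v.
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    return [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]


def rotate_bone(parent, child, axis, theta):
    """Hypothetical helper: rotate a child joint about its parent joint.

    The bone vector (child - parent) is rotated, so the bone length is
    preserved; this is the kind of geometric consistency a skeleton
    perturbation would plausibly need to respect.
    """
    bone = [c - p for c, p in zip(child, parent)]
    rotated = rodrigues_rotate(bone, axis, theta)
    return [p + r for p, r in zip(parent, rotated)]


# Example: swing a unit-length bone a quarter turn about the z-axis.
new_child = rotate_bone([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0], math.pi / 2)
```

Because the rotation acts on the bone vector rather than on absolute joint coordinates, the perturbed joint stays at the same distance from its parent, whatever the axis and angle.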
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.