Extended multi-stream temporal-attention module for skeleton-based human action recognition (HAR)
Faisal Mehmood, Xin Guo, Enqing Chen, Muhammad Azeem Akbar, Arif Ali Khan, Sami Ullah
DOI: 10.1016/j.chb.2024.108482
Published: 2024-10-29
URL: https://www.sciencedirect.com/science/article/pii/S0747563224003509
Abstract
Graph convolutional networks (GCNs) are an effective technique for skeleton-based human action recognition (HAR). GCNs generalize CNNs to non-Euclidean domains, which makes them more flexible. Previous GCN-based models still have several issues: (I) the graph structure is the same for all model layers and all input data, which is problematic given the hierarchical structure of GCN models and the diversity of HAR inputs; (II) bone length and orientation are understudied despite their significance and variance in HAR. To address these issues, we introduce an Extended Multi-stream Temporal-attention Adaptive GCN (EMS-TAGCN). The network topology of the proposed model is trained either consistently or independently according to the input data; this data-driven technique makes the graphs more flexible and faster to adapt to a new dataset. A spatial, temporal, and channel attention module helps the adaptive graph convolutional layer focus on joints, frames, and features. In addition, a multi-stream framework representing bones, joints, and their motion enhances recognition accuracy. The proposed model outperforms previous models on NTU RGBD under the CS and CV settings by 0.6% and 1.4%, respectively; Kinetics-skeleton Top-1 and Top-5 accuracy improve by 1.4%, UCF-101 accuracy by 2.34%, and HMDB-51 accuracy by 1.8%. Our model consistently outperformed the other models, and the results were statistically significant, demonstrating its advantage for HAR and its ability to provide reliable and accurate results.
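The abstract does not say how the bone and motion streams are derived from the joint coordinates. The sketch below shows one common convention from multi-stream skeleton models, offered as an assumption rather than the authors' method: a bone is the vector from a parent joint to its child, and motion is the difference between consecutive frames. The five-joint skeleton, the `TOY_BONES` pairs, and the function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Frame = List[Vec3]  # one (x, y, z) coordinate per joint

# Hypothetical toy skeleton: (child, parent) index pairs; joint 0 is the root.
TOY_BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]

ZERO: Vec3 = (0.0, 0.0, 0.0)


def bone_stream(frames: Sequence[Frame],
                bones: Sequence[Tuple[int, int]]) -> List[Frame]:
    """Assumed convention: bone vector = child joint minus parent joint.

    The result is indexed by the child joint; the root keeps a zero vector.
    Length and orientation of each bone are both encoded in this vector.
    """
    out = []
    for frame in frames:
        vecs = [ZERO] * len(frame)
        for child, parent in bones:
            vecs[child] = tuple(
                c - p for c, p in zip(frame[child], frame[parent])
            )
        out.append(vecs)
    return out


def motion_stream(frames: Sequence[Frame]) -> List[Frame]:
    """Assumed convention: motion = frame[t + 1] minus frame[t].

    The last frame is zero-padded so the output keeps the input length.
    """
    if not frames:
        return []
    out = []
    for t in range(len(frames) - 1):
        out.append([
            tuple(b - a for a, b in zip(j0, j1))
            for j0, j1 in zip(frames[t], frames[t + 1])
        ])
    out.append([ZERO] * len(frames[-1]))
    return out


# Two frames of the toy skeleton; joint 3 moves up by one unit.
frames = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (-1.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (-1.0, 2.0, 0.0), (1.0, 1.0, 0.0)],
]

joints = frames                      # stream 1: raw joint coordinates
bones = bone_stream(frames, TOY_BONES)   # stream 2: bone vectors
joint_motion = motion_stream(joints)     # stream 3: joint motion
bone_motion = motion_stream(bones)       # stream 4: bone motion
```

In a multi-stream setup of this kind, each stream would typically be fed to its own network and the per-class scores combined at the end. How EMS-TAGCN fuses its streams is not specified in the abstract.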