Enhancing human behavior recognition with spatiotemporal graph convolutional neural networks and skeleton sequences
Authors: Jianmin Xu, Fenglin Liu, Qinghui Wang, Ruirui Zou, Ying Wang, Junling Zheng, Shaoyi Du, Wei Zeng
Journal: EURASIP Journal on Advances in Signal Processing (published 2024-05-07)
DOI: 10.1186/s13634-024-01156-w (https://doi.org/10.1186/s13634-024-01156-w)
Objectives
This study aims to improve supervised human activity recognition based on spatiotemporal graph convolutional neural networks by addressing two key challenges: (1) extracting local spatial features from implicit joint connections, which standard graph convolutions over the natural joint connections alone cannot capture; and (2) capturing long-range temporal dependencies that extend beyond the limited temporal receptive fields of conventional temporal convolutions.
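To make the second challenge concrete, the following is a minimal sketch of how the temporal receptive field of a stack of 1D temporal convolutions can be computed. The function name `temporal_receptive_field` and the layer settings in the usage note (kernel size 9, stride 1 or 2) are illustrative assumptions, not values taken from the paper.

```python
def temporal_receptive_field(layers):
    """Return the number of input frames seen by one output frame.

    `layers` is a list of (kernel_size, stride, dilation) tuples, applied
    in order. These settings are hypothetical, not the paper's network.
    """
    rf = 1    # receptive field in input frames
    jump = 1  # spacing (in input frames) between adjacent outputs so far
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Three stacked temporal convolutions, kernel size 9, no striding.
plain_stack = [(9, 1, 1), (9, 1, 1), (9, 1, 1)]
# Same stack, but the second layer downsamples time by 2.
strided_stack = [(9, 1, 1), (9, 2, 1), (9, 1, 1)]
```

Under these assumed settings, `temporal_receptive_field(plain_stack)` covers 25 frames and `temporal_receptive_field(strided_stack)` covers 33, which illustrates why dependencies spanning longer intervals need either deeper stacks or an explicit long-range mechanism.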
Methods
To achieve these objectives, we propose three novel modules integrated into the spatiotemporal graph convolutional framework: (1) a connectivity feature extraction module that uses attention to model implicit joint connections and extract their local spatial features; (2) a long-range frame difference feature extraction module that captures wider temporal context by taking differences over larger frame intervals; and (3) a coordinate transformation module that enriches the spatial representation by fusing Cartesian and spherical coordinates.
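The three ideas above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the zero-padding of early frames, the fusion by channel concatenation, and the scaled dot-product form of the attention are all choices made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def cartesian_to_spherical(xyz):
    """Convert (..., 3) Cartesian joints to (r, polar angle, azimuth)."""
    x, y, z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(np.hypot(x, y), z)  # polar angle from the +z axis
    phi = np.arctan2(y, x)                 # azimuth in the x-y plane
    return np.stack([r, theta, phi], axis=-1)


def fuse_coordinates(xyz):
    """Assumed fusion: concatenate both systems along the channel axis."""
    return np.concatenate([xyz, cartesian_to_spherical(xyz)], axis=-1)


def frame_difference(seq, interval):
    """Difference between frame t and frame t - interval.

    seq has shape (T, V, C): frames, joints, channels. The first
    `interval` frames have no predecessor and are left as zeros
    (an assumption made here).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    diff = np.zeros_like(seq)
    diff[interval:] = seq[interval:] - seq[:-interval]
    return diff


def attention_adjacency(features):
    """Soft joint-to-joint connectivity from per-joint features (V, C).

    Uses a row-wise softmax over scaled dot products, so joints that are
    not physically linked can still receive non-zero weight.
    """
    scores = features @ features.T / np.sqrt(features.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Usage on a random skeleton clip: 30 frames, 25 joints, 3D coordinates.
rng = np.random.default_rng(0)
clip = rng.normal(size=(30, 25, 3))
fused = fuse_coordinates(clip)                  # shape (30, 25, 6)
long_diff = frame_difference(clip, interval=5)  # shape (30, 25, 3)
adjacency = attention_adjacency(clip[0])        # shape (25, 25)
```

A larger `interval` makes each difference feature summarize motion over a longer span, which is the intuition behind the long-range frame difference module; in a trained network, the attention weights would be computed from learned feature projections rather than from raw coordinates.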
Findings
Evaluation across multiple datasets shows that the proposed method achieves significant improvements over the baseline networks, with the largest accuracy gains being 2.76% on NTU-RGB+D 60 (Cross-subject), 4.1% on NTU-RGB+D 120 (Cross-subject), and 4.3% on Kinetics (Top-1), outperforming current state-of-the-art algorithms. The paper thus presents an approach that improves the accuracy of human activity recognition, a core technology for autonomous systems.
About the journal:
The aim of the EURASIP Journal on Advances in Signal Processing is to highlight the theoretical and practical aspects of signal processing in new and emerging technologies. The journal is directed as much at the practicing engineer as at the academic researcher. Authors of articles with novel contributions to the theory and/or practice of signal processing are welcome to submit their articles for consideration.