Channel attention and multi-scale graph neural networks for skeleton-based action recognition

IF 1.4 | CAS Tier 4 (Computer Science) | JCR Q4 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Ronghao Dang, Chengju Liu, Meilin Liu, Qi Chen
{"title":"Channel attention and multi-scale graph neural networks for skeleton-based action recognition","authors":"Ronghao Dang, Chengju Liu, Meilin Liu, Qi Chen","doi":"10.3233/aic-210250","DOIUrl":null,"url":null,"abstract":"3D skeleton data has been widely used in action recognition as the skeleton-based method has achieved good performance in complex dynamic environments. The rise of spatio-temporal graph convolutions has attracted much attention to use graph convolution to extract spatial and temporal features together in the field of skeleton-based action recognition. However, due to the huge difference in the focus of spatial and temporal features, it is difficult to improve the efficiency of extracting the spatiotemporal features. In this paper, we propose a channel attention and multi-scale neural network (CA-MSN) for skeleton-based action recognition with a series of spatio-temporal extraction modules. We exploit the relationship of body joints hierarchically through two modules, i.e., a spatial module which uses the residual GCN network with the channel attention block to extract the high-level spatial features, and a temporal module which uses the multi-scale TCN network to extract the temporal features at different scales. We perform extensive experiments on both the NTU-RGBD60 and NTU-RGBD120 datasets to verify the effectiveness of our network. The comparison results show that our method achieves the state-of-the-art performance with the competitive computing speed. In order to test the application effect of our CA-MSN model, we design a multi-task tandem network consisting of 2D pose estimation, 2D to 3D pose regression and skeleton action recognition model. The end-to-end (RGB video-to-action type) recognition effect is demonstrated. The code is available at https://github.com/Rh-Dang/CA-MSN-action-recognition.git.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"106 1","pages":"187-205"},"PeriodicalIF":1.4000,"publicationDate":"2022-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Communications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/aic-210250","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 2

Abstract

3D skeleton data is widely used in action recognition because skeleton-based methods perform well in complex dynamic environments. With the rise of spatio-temporal graph convolutions, using graph convolution to extract spatial and temporal features jointly has attracted much attention in skeleton-based action recognition. However, because spatial and temporal features emphasize very different aspects of the data, it is difficult to extract spatio-temporal features efficiently. In this paper, we propose a channel attention and multi-scale neural network (CA-MSN) for skeleton-based action recognition, built from a series of spatio-temporal extraction modules. We exploit the relationships between body joints hierarchically through two modules: a spatial module, which uses a residual GCN with a channel attention block to extract high-level spatial features, and a temporal module, which uses a multi-scale TCN to extract temporal features at different scales. We perform extensive experiments on the NTU-RGBD60 and NTU-RGBD120 datasets to verify the effectiveness of our network. The comparison results show that our method achieves state-of-the-art performance at a competitive computing speed. To test the practical applicability of the CA-MSN model, we design a multi-task tandem network consisting of 2D pose estimation, 2D-to-3D pose regression, and skeleton-based action recognition, and demonstrate end-to-end (RGB video to action label) recognition. The code is available at https://github.com/Rh-Dang/CA-MSN-action-recognition.git.
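As a concrete illustration of the two components the abstract names, the following minimal PyTorch sketch pairs an SE-style channel attention gate with a parallel multi-scale temporal convolution over skeleton feature maps. Class names, channel counts, kernel sizes, and the (batch, channels, frames, joints) tensor layout are illustrative assumptions, not the authors' implementation; the actual CA-MSN code is in the repository linked above.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Squeeze-and-excitation style gate over the channel dimension
    # (an illustrative stand-in for the paper's channel attention block).
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (N, C, T, V); squeeze by averaging over frames T and joints V.
        w = x.mean(dim=(2, 3))                      # (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # (N, C, 1, 1)
        return x * w                                # re-weight channels

class MultiScaleTCN(nn.Module):
    # Parallel temporal convolutions with different kernel sizes,
    # concatenated so the module sees motion at several temporal scales.
    def __init__(self, channels, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        assert channels % len(kernel_sizes) == 0
        branch_channels = channels // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, branch_channels,
                      kernel_size=(k, 1), padding=(k // 2, 0))
            for k in kernel_sizes
        )

    def forward(self, x):
        # Each branch convolves along the frame axis only; joints untouched.
        return torch.cat([branch(x) for branch in self.branches], dim=1)

if __name__ == "__main__":
    x = torch.randn(2, 64, 30, 25)  # 2 clips, 64 channels, 30 frames, 25 joints
    y = MultiScaleTCN(64)(ChannelAttention(64)(x))
    print(y.shape)                  # torch.Size([2, 64, 30, 25])

Stacking such a channel-gated spatial layer and a multi-scale temporal layer, with residual connections as the abstract describes, gives the general shape of one spatio-temporal extraction module.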
Source Journal
AI Communications
Engineering & Technology - Computer Science: Artificial Intelligence
CiteScore: 2.30
Self-citation rate: 12.50%
Articles published: 34
Review time: 4.5 months
About the journal: AI Communications is a journal on artificial intelligence (AI) with a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: scientific institutions as well as commercial and industrial companies. AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition, it contains high-level background material, both at the technical level and at the level of opinions, policies and news.