MCNet: A unified multi-center graph convolutional network based on skeletal behavior recognition

IF 6.2, Zone 2 (Engineering & Technology), Q1 ENGINEERING, MULTIDISCIPLINARY
Haiping Zhang, Xinhao Zhang, Dongjing Wang, Fuxing Zhou, Junfeng Yan
Alexandria Engineering Journal, vol. 120, pp. 116-127. Published 2025-02-12. DOI: 10.1016/j.aej.2025.01.118
https://www.sciencedirect.com/science/article/pii/S1110016825001462
Citations: 0

Abstract

Skeletal data are stable and computationally efficient, which makes them a popular input for video action recognition. Existing skeleton-based behavior recognition methods built on graph convolutional networks (GCNs) have made progress, but their fixed graph structure and the lack of interaction between objects in the dataset limit the flexibility of traditional models when recognizing highly similar actions, which in turn degrades final performance. To address these issues, we propose a unified multi-center graph convolutional network (MCNet) for skeletal behavior recognition. Actions with a large movement amplitude can shift the center of the human body. To recognize such actions, we propose a multi-center training approach in which three centers are defined when constructing the topology graph, and a Multi-Center Data Selector (MCDS) differentiates and selects among these centers, making the recognition task more adaptable. Because some action categories are easily confused with one another, we also propose a multi-modal training scheme that uses a large language model as a knowledge engine to provide textual descriptions of global actions under the different centers, which helps differentiate similar actions and further improves recognition. Finally, an attention module aggregates the features of a multi-scale adjacency matrix along the channel dimension. To verify the effectiveness of the proposed network, a series of ablation experiments and model analyses were conducted on three datasets, and the model was compared with other state-of-the-art models, including CTR-GCN, Info-GCN, and STF.
The results show that the proposed model reaches state-of-the-art (SOTA) performance. MCNet outperforms the CTR-GCN baseline by 0.6% on X-Sub and 0.3% on X-View on the NTU RGB+D 60 dataset. On the NTU RGB+D 120 dataset the gain is more pronounced, with improvements of up to 0.8% on the X-Sub and X-Set benchmarks.
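The abstract does not say how the three centers enter the topology graph, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the centripetal/centrifugal edge partition commonly used in skeleton GCNs, where each bone is oriented by hop distance to a chosen center joint, and shows that changing the center changes the resulting adjacency stack. The 7-joint skeleton, the function names, and the center indices are all hypothetical.

```python
from collections import deque

import numpy as np

# Hypothetical 7-joint toy skeleton (NOT the NTU RGB+D layout):
# 0 hip, 1 spine, 2 chest, 3 head, 4 left hand, 5 right hand, 6 foot
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (0, 6)]
NUM_JOINTS = 7


def hop_distance(edges, num_joints, center):
    """Breadth-first hop count from `center` to every joint."""
    neighbors = [[] for _ in range(num_joints)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    dist = np.full(num_joints, -1, dtype=int)
    dist[center] = 0
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if dist[v] < 0:  # not visited yet
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def center_partitioned_adjacency(edges, num_joints, center):
    """Return a (3, V, V) stack: self-links, centripetal, centrifugal.

    Convention: adj[k, i, j] == 1 means joint i aggregates from joint j.
    Assumes a tree-shaped skeleton, so the two ends of a bone never
    have the same hop distance to the center.
    """
    dist = hop_distance(edges, num_joints, center)
    adj = np.zeros((3, num_joints, num_joints))
    adj[0] = np.eye(num_joints)
    for i, j in edges:
        near, far = (i, j) if dist[i] < dist[j] else (j, i)
        adj[1, far, near] = 1.0  # neighbor lies closer to the center
        adj[2, near, far] = 1.0  # neighbor lies farther from the center
    return adj


# Three arbitrary placeholder centers; the paper's actual centers are not
# given in the abstract.
graphs = {c: center_partitioned_adjacency(EDGES, NUM_JOINTS, c) for c in (0, 1, 2)}
```

In this sketch the bone between joints 1 and 2 is centripetal for joint 2 when the center is joint 0, but centripetal for joint 1 when the center is joint 2, which is the kind of center-dependent topology a selector such as the MCDS would have to choose between.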
Source journal: Alexandria Engineering Journal (Engineering: General Engineering)
CiteScore: 11.20
Self-citation rate: 4.40%
Articles published: 1015
Review time: 43 days
Journal description: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Its papers are grouped into five sections:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering