A module selection-based approach for efficient skeleton human action recognition

IF 3.4; Zone 2, Engineering & Technology; JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Displays. Pub Date: 2025-09-30. DOI: 10.1016/j.displa.2025.103233

Shurong Chai, Rahul Kumar Jain, Shiyu Teng, Jiaqing Liu, Tomoko Tateyama, Yen-Wei Chen

Displays, volume 91, Article 103233. Publication type: Journal Article. Open access: no. Source: https://www.sciencedirect.com/science/article/pii/S0141938225002707

Citations: 0


A module selection-based approach for efficient skeleton human action recognition

Human action recognition has become a key aspect of human–computer interaction. Existing spatial–temporal network-based human action recognition methods have achieved better performance, but at a high cost in computational complexity. These methods make the final predictions using a stack of blocks, where each block contains a spatial and a temporal module for extracting the respective features. However, an alternative arrangement of these blocks in the network may affect the optimal configuration for each specific sample. Moreover, these methods require a long inference time; consequently, their implementation on cutting-edge low-spec devices is challenging. To resolve these limitations, we propose a decision network-based adaptive framework that dynamically determines the arrangement of the spatial and temporal modules to ensure a cost-effective network design. To determine the optimal network structure, we have investigated module selection decision-making schemes at the local and global levels. We have conducted extensive experiments using three publicly available datasets. The results show that our proposed framework arranges the modules in an optimal way and efficiently reduces the computation cost while maintaining performance. Our code is available at https://github.com/11yxk/dynamic_skeleton.
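The abstract describes the per-sample choice between spatial and temporal modules only at a high level. The following is a minimal illustrative sketch of that idea in plain Python and is not the authors' implementation. Every name in it (`spatial_module`, `temporal_module`, `decide`, `run`) is hypothetical, the modules are simple averaging stand-ins for graph and temporal convolutions, and the hand-written threshold rule merely takes the place of the learned decision network. The reading of "local" (re-decide before every block) and "global" (decide once from the input and reuse the choice) is likewise an assumption.

```python
from statistics import mean


def spatial_module(seq):
    """Mix each joint with its chain neighbours (stand-in for a graph convolution)."""
    out = []
    for frame in seq:
        n = len(frame)
        out.append([mean(frame[max(0, j - 1):min(n, j + 2)]) for j in range(n)])
    return out


def temporal_module(seq):
    """Average each frame with its predecessor (stand-in for a temporal convolution)."""
    out = [list(seq[0])]
    for prev, cur in zip(seq, seq[1:]):
        out.append([(a + b) / 2 for a, b in zip(prev, cur)])
    return out


def decide(seq, thresh=0.1):
    """Hypothetical threshold rule; the paper uses a learned decision network instead."""
    diffs = [abs(a - b) for prev, cur in zip(seq, seq[1:]) for a, b in zip(prev, cur)]
    motion = mean(diffs) if diffs else 0.0      # how much the pose changes over time
    spread = mean(max(f) - min(f) for f in seq)  # how much joints differ within a frame
    return spread > thresh, motion > thresh      # (use_spatial, use_temporal)


def run(seq, num_blocks=3, scheme="local"):
    """Apply up to two modules per block, skipping those the decision rule rejects."""
    plan = []
    global_choice = decide(seq)  # assumed "global" scheme: one decision for all blocks
    for _ in range(num_blocks):
        use_s, use_t = decide(seq) if scheme == "local" else global_choice
        if use_s:
            seq = spatial_module(seq)
        if use_t:
            seq = temporal_module(seq)
        plan.append((use_s, use_t))
    return seq, plan


# A static pose: joints differ, but nothing moves, so temporal modules are skipped.
static_pose = [[0.0, 1.0, 2.0] for _ in range(4)]
_, plan = run(static_pose, num_blocks=3, scheme="local")
modules_run = sum(s + t for s, t in plan)  # 3 here, versus 6 for a fixed stack
```

In this toy setting the saving comes purely from skipping modules whose input shows little variation along their axis; the actual framework, its decision network, and its training procedure are in the repository linked above.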


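Source journal

Displays (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 4.60

Self-citation rate: 25.60%

Articles published: 138

Review time: 92 days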

About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally be featured.