Enhancing industrial human action recognition framework integrating skeleton data acquisition, data repair and optimized graph convolutional networks

Bojian Liu, Yufeng Yao, Honggang Wang, Zengmin He, Anyang Dong

Robotics and Computer-Integrated Manufacturing, Volume 97, Article 103089. Published 2025-07-09. DOI: 10.1016/j.rcim.2025.103089
Citations: 0
Abstract
The precise interpretation of human actions is crucial for seamless interaction and operational efficiency in industrial human-robot collaboration. However, existing skeleton-based action recognition methods focus on algorithmic applications while overlooking key challenges such as robust data acquisition, validation, and repair. Additionally, the scarcity of high-quality industrial datasets and the difficulty of distinguishing similar actions further limit the ability to infer operators' intentions accurately. This paper presents a novel framework that addresses these challenges by integrating skeleton data acquisition, an effective data augmentation method, and an optimized graph convolutional network. Specifically, the proposed framework employs a pose estimation method for 2D (two-dimensional) joint estimation and a 2D-to-3D (three-dimensional) lifting technique, supplemented with a robust method for repairing invalid skeleton data and a skeletal feature-based data augmentation strategy. To enhance action recognition, this paper introduces the Channel-Topology Refinement Graph Convolutional Network Plus (CTR-GCN-Plus), which incorporates dynamic topology learning and multi-channel feature aggregation, augmented with hand motion integration for finer differentiation of similar actions. The proposed framework is evaluated on an industrial assembly dataset covering challenging scenarios such as occlusions and similar actions. Experimental results demonstrate that the proposed methods significantly improve accuracy, enhance recognition of similar actions, and effectively account for individual variations, outperforming existing approaches in industrial human-robot collaboration environments.
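The abstract outlines a multi-stage pipeline: 2D joint estimation, 2D-to-3D lifting, repair of invalid skeleton data, skeletal feature-based augmentation, and classification with CTR-GCN-Plus. The sketch below is a minimal, hypothetical illustration of how such a pipeline could be wired together; every function name, the 17-joint skeleton layout, and the stage implementations are assumptions for illustration only, not the authors' released code or models.

```python
# Hypothetical end-to-end pipeline sketch (all names and shapes are assumptions,
# not the authors' implementation).
import numpy as np

def estimate_2d_joints(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a 2D pose estimator; returns (J, 2) joint coordinates."""
    num_joints = 17  # assumed COCO-style joint count
    return np.zeros((num_joints, 2))

def lift_to_3d(joints_2d: np.ndarray) -> np.ndarray:
    """Stand-in for a 2D-to-3D lifting network; returns (J, 3) coordinates."""
    return np.concatenate([joints_2d, np.zeros((joints_2d.shape[0], 1))], axis=1)

def repair_skeleton(sequence: np.ndarray) -> np.ndarray:
    """Fill invalid (NaN) joints by linear interpolation over time -
    a simple stand-in for the paper's skeleton data repair step."""
    seq = sequence.copy()
    for j in range(seq.shape[1]):
        for c in range(seq.shape[2]):
            col = seq[:, j, c]
            bad = np.isnan(col)
            if bad.any() and (~bad).any():
                col[bad] = np.interp(np.flatnonzero(bad),
                                     np.flatnonzero(~bad), col[~bad])
    return seq

def augment(sequence: np.ndarray) -> np.ndarray:
    """Skeletal feature-based augmentation, illustrated here as random scaling."""
    return sequence * np.random.uniform(0.9, 1.1)

def classify_action(sequence: np.ndarray) -> int:
    """Stand-in for the CTR-GCN-Plus classifier; returns an action label id."""
    return 0

# Toy usage: a 30-frame clip of dummy RGB frames.
frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(30)]
seq_3d = np.stack([lift_to_3d(estimate_2d_joints(f)) for f in frames])  # (T, J, 3)
seq_3d = repair_skeleton(seq_3d)
action_id = classify_action(augment(seq_3d))
print("predicted action id:", action_id)
```

The stubs only fix the data flow (RGB frames to 2D joints, 3D skeletons, repaired and augmented sequences, then a label); in the paper each stage is a trained model rather than a placeholder.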
Journal introduction:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.