Daxin Liu , Yu Huang , Zhenyu Liu , Haoyang Mao , Pengcheng Kan , Jianrong Tan
{"title":"基于骨架的装配动作识别方法与人机协作装配的特征融合","authors":"Daxin Liu , Yu Huang , Zhenyu Liu , Haoyang Mao , Pengcheng Kan , Jianrong Tan","doi":"10.1016/j.jmsy.2024.08.019","DOIUrl":null,"url":null,"abstract":"<div><p>Human-robot collaborative assembly (HRCA) is one of the current trends of intelligent manufacturing, and assembly action recognition is the basis of and the key to HRCA. A multi-scale and multi-stream graph convolutional network (2MSGCN) for assembly action recognition is proposed in this paper. 2MSGCN takes the temporal skeleton sample as input and outputs the class of the assembly action to which the sample belongs. RGBD images of the operator performing the assembly actions are captured by three RGBD cameras mounted at different viewpoints and pre-processed to generate the complete human skeleton. A multi-scale and multi-stream (2MS) mechanism and a feature fusion mechanism are proposed to improve the recognition accuracy of 2MSGCN. The 2MS mechanism is designed to input the skeleton data to 2MSGCN in the form of a joint stream, a bone stream and a motion stream, while the joint stream further generates two sets of input with rough scales to represent features in higher dimensional human skeleton, which obtains information of different scales and streams in temporal skeleton samples. And the feature fusion mechanism enables the fused feature to retain the information of the sub-feature while incorporating union information between the sub-features. Also, the improved convolution operation based on Ghost module is introduced to the 2MSGCN to reduce the number of the parameters and floating-point operations (FLOPs) and improve the real-time performance. Considering that there will be transitional actions when the operator switches between assembly actions in the continuous assembly process, a transitional action classification (TAC) method is proposed to distinguish the transitional actions from the assembly actions. 
Experiments on the public dataset NTU RGB+D 60 (NTU 60) and a self-built assembly action dataset indicate that the proposed 2MSGCN outperforms the mainstream models in recognition accuracy and real-time performance.</p></div>","PeriodicalId":16227,"journal":{"name":"Journal of Manufacturing Systems","volume":"76 ","pages":"Pages 553-566"},"PeriodicalIF":12.2000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A skeleton-based assembly action recognition method with feature fusion for human-robot collaborative assembly\",\"authors\":\"Daxin Liu , Yu Huang , Zhenyu Liu , Haoyang Mao , Pengcheng Kan , Jianrong Tan\",\"doi\":\"10.1016/j.jmsy.2024.08.019\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Human-robot collaborative assembly (HRCA) is one of the current trends of intelligent manufacturing, and assembly action recognition is the basis of and the key to HRCA. A multi-scale and multi-stream graph convolutional network (2MSGCN) for assembly action recognition is proposed in this paper. 2MSGCN takes the temporal skeleton sample as input and outputs the class of the assembly action to which the sample belongs. RGBD images of the operator performing the assembly actions are captured by three RGBD cameras mounted at different viewpoints and pre-processed to generate the complete human skeleton. A multi-scale and multi-stream (2MS) mechanism and a feature fusion mechanism are proposed to improve the recognition accuracy of 2MSGCN. The 2MS mechanism is designed to input the skeleton data to 2MSGCN in the form of a joint stream, a bone stream and a motion stream, while the joint stream further generates two sets of input with rough scales to represent features in higher dimensional human skeleton, which obtains information of different scales and streams in temporal skeleton samples. 
And the feature fusion mechanism enables the fused feature to retain the information of the sub-feature while incorporating union information between the sub-features. Also, the improved convolution operation based on Ghost module is introduced to the 2MSGCN to reduce the number of the parameters and floating-point operations (FLOPs) and improve the real-time performance. Considering that there will be transitional actions when the operator switches between assembly actions in the continuous assembly process, a transitional action classification (TAC) method is proposed to distinguish the transitional actions from the assembly actions. Experiments on the public dataset NTU RGB+D 60 (NTU 60) and a self-built assembly action dataset indicate that the proposed 2MSGCN outperforms the mainstream models in recognition accuracy and real-time performance.</p></div>\",\"PeriodicalId\":16227,\"journal\":{\"name\":\"Journal of Manufacturing Systems\",\"volume\":\"76 \",\"pages\":\"Pages 553-566\"},\"PeriodicalIF\":12.2000,\"publicationDate\":\"2024-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Manufacturing Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0278612524001821\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, INDUSTRIAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Manufacturing 
Systems","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0278612524001821","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, INDUSTRIAL","Score":null,"Total":0}
A skeleton-based assembly action recognition method with feature fusion for human-robot collaborative assembly
Human-robot collaborative assembly (HRCA) is one of the current trends in intelligent manufacturing, and assembly action recognition is both the basis of and the key to HRCA. This paper proposes a multi-scale and multi-stream graph convolutional network (2MSGCN) for assembly action recognition. 2MSGCN takes a temporal skeleton sample as input and outputs the class of the assembly action to which the sample belongs. RGBD images of the operator performing assembly actions are captured by three RGBD cameras mounted at different viewpoints and pre-processed to generate a complete human skeleton. A multi-scale and multi-stream (2MS) mechanism and a feature fusion mechanism are proposed to improve the recognition accuracy of 2MSGCN. The 2MS mechanism feeds the skeleton data to 2MSGCN as a joint stream, a bone stream, and a motion stream, while the joint stream additionally generates two coarser-scale inputs that represent higher-dimensional features of the human skeleton, so that information at different scales and from different streams is extracted from the temporal skeleton samples. The feature fusion mechanism enables the fused feature to retain the information of each sub-feature while incorporating the shared information between sub-features. In addition, an improved convolution operation based on the Ghost module is introduced into 2MSGCN to reduce the number of parameters and floating-point operations (FLOPs) and thus improve real-time performance. Because transitional actions occur when the operator switches between assembly actions during a continuous assembly process, a transitional action classification (TAC) method is proposed to distinguish transitional actions from assembly actions. Experiments on the public dataset NTU RGB+D 60 (NTU 60) and a self-built assembly action dataset show that the proposed 2MSGCN outperforms mainstream models in both recognition accuracy and real-time performance.
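The joint, bone, and motion streams described above follow a convention common in skeleton-based action recognition: the bone stream is the vector from each joint's parent to the joint itself, and the motion stream is the frame-to-frame displacement of each joint. The sketch below illustrates that derivation under assumptions; the array sizes, the skeleton topology, and the variable names are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical skeleton clip: T frames, V joints, 3-D coordinates.
T, V, C = 64, 25, 3
rng = np.random.default_rng(0)
joints = rng.random((T, V, C), dtype=np.float32)

# Hypothetical parent index for each joint; joint 0 is its own parent (root).
parents = np.arange(V) - 1
parents[0] = 0

# Bone stream: vector from each joint's parent to the joint itself.
bones = joints - joints[:, parents, :]

# Motion stream: per-joint displacement between consecutive frames,
# zero-padded at the last frame so all streams keep length T.
motion = np.zeros_like(joints)
motion[:-1] = joints[1:] - joints[:-1]

# Each stream would feed a separate GCN branch before feature fusion.
print(joints.shape, bones.shape, motion.shape)
```

The three arrays share one shape, so a multi-stream network can process them with identically structured branches and fuse the resulting features later.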
Journal introduction:
The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs.
With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.