Deep Learning-based Feature Fusion for Action Recognition Using Skeleton Information

2023 International Conference on Robotics and Automation in Industry (ICRAI) Pub Date : 2023-03-03 DOI:10.1109/ICRAI57502.2023.10089577

Fahad Ul Hassan Asif Mattoo, U. S. Khan, Tahir Nawaz, N. Rashid

Abstract

Various action recognition systems have been proposed, but most are not feasible for real-time applications. Skeleton-based action recognition has a low computational cost and is not affected by background changes. As pose estimation models have become faster (almost real-time), DD-net, a model with only 1.8M parameters, was created to predict actions from skeleton information. An improved version of the model, named TD-net, was released recently. This model is rich in geometric features but lacks motion-based features. To overcome this, we added two motion features to the model: velocity and acceleration. These features were computed using a second-order Taylor approximation in a window around the current frame. The model's accuracy was compared with DD-net, TD-net, and state-of-the-art algorithms on three datasets. Compared with TD-net, accuracy increased on all three datasets (1.1% on SHREC, 1.7% on FPHAB, and 2% on JHMDB).
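The abstract does not give the exact formulas or window size used for the two motion features, so the following is only a minimal sketch of one common reading. Expanding a joint position p(t) to second order around frame t and combining the expansions at t-1 and t+1 yields the central-difference estimates v(t) ≈ (p(t+1) - p(t-1)) / 2 and a(t) ≈ p(t+1) - 2p(t) + p(t-1). The function name `motion_features`, the three-frame window, the `(T, J, D)` array layout, and the edge-frame padding are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def motion_features(poses, dt=1.0):
    """Estimate per-joint velocity and acceleration from a skeleton sequence.

    poses: array of shape (T, J, D) -- T frames, J joints, D coordinates.
    dt:    time step between frames (1.0 means "per frame").

    Hypothetical helper: uses a 3-frame window and central differences
    obtained from a second-order Taylor expansion around each frame.
    """
    poses = np.asarray(poses, dtype=float)
    if poses.ndim != 3 or poses.shape[0] < 3:
        raise ValueError("poses must have shape (T, J, D) with T >= 3")

    # Assumption: repeat the first and last frame so every frame has both
    # neighbours. Estimates at the two boundary frames are less accurate.
    padded = np.concatenate([poses[:1], poses, poses[-1:]], axis=0)
    prev_frame, cur_frame, next_frame = padded[:-2], padded[1:-1], padded[2:]

    # First derivative: the second-order terms cancel in p(t+1) - p(t-1).
    velocity = (next_frame - prev_frame) / (2.0 * dt)
    # Second derivative: the first-order terms cancel in p(t+1) - 2p(t) + p(t-1).
    acceleration = (next_frame - 2.0 * cur_frame + prev_frame) / dt**2
    return velocity, acceleration
```

Both outputs have the same shape as the input, so in a DD-net-style pipeline they could be flattened per frame and fed to the network alongside the geometric features. How the paper actually fuses them is not stated in the abstract.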
