Deep Learning-based Feature Fusion for Action Recognition Using Skeleton Information

Fahad Ul Hassan Asif Mattoo, U. S. Khan, Tahir Nawaz, N. Rashid

Published in: 2023 International Conference on Robotics and Automation in Industry (ICRAI), 2023-03-03
DOI: 10.1109/ICRAI57502.2023.10089577 (https://doi.org/10.1109/ICRAI57502.2023.10089577)

Abstract

Many action recognition systems have been proposed, but most are not feasible for real-time applications. Skeleton-based action recognition has a low computational cost and is unaffected by background changes. As pose estimation models have become faster (almost real-time), DD-net, a model with only 1.8M parameters, was created to predict actions from skeleton information. An improved version, TD-net, was released recently. TD-net is rich in geometric features but lacks motion-based features. To address this, we added two motion features to the model: velocity and acceleration. These features are computed using a second-order Taylor approximation in a window around the current frame. The model's accuracy was compared with DD-net, TD-net, and state-of-the-art algorithms on three datasets. Compared with TD-net, accuracy improves on all three datasets: by 1.1% on SHREC, 1.7% on FPHAB, and 2% on JHMDB.
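
The abstract does not give the exact formulation of the two motion features, so the following is only a minimal sketch of one common reading. Expanding a joint position to second order around frame t, x(t ± h) ≈ x(t) ± h·v(t) + (h²/2)·a(t), and solving for v and a yields central differences over a window of h frames on each side. The function name `motion_features`, the `(frames, joints, dims)` array layout, the window half-width `h`, and the edge-replication padding are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def motion_features(joints, h=1):
    """Estimate per-joint velocity and acceleration from skeleton frames.

    joints: array of shape (frames, joints, dims).
    h: half-width of the window around the current frame (assumed, >= 1).
    Returns (velocity, acceleration), each with the same shape as joints.
    """
    joints = np.asarray(joints, dtype=float)
    # Assumption: repeat the first and last frames so that every frame
    # has a full window; the paper may treat sequence edges differently.
    padded = np.pad(joints, ((h, h), (0, 0), (0, 0)), mode="edge")
    prev_frames = padded[:-2 * h]   # x(t - h)
    curr_frames = padded[h:-h]      # x(t)
    next_frames = padded[2 * h:]    # x(t + h)
    # Subtracting the two Taylor expansions isolates the velocity term.
    velocity = (next_frames - prev_frames) / (2.0 * h)
    # Adding them isolates the acceleration term.
    acceleration = (next_frames - 2.0 * curr_frames + prev_frames) / (h * h)
    return velocity, acceleration
```

In a pipeline like the one the abstract describes, the two returned arrays would be passed to the network alongside the existing geometric features. Estimates at the first and last `h` frames depend on the padding choice and should be treated as approximate.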