Automatic 3D Skeleton-based Dynamic Hand Gesture Recognition Using Multi-Layer Convolutional LSTM
A. Mohammed, Yuan Gao, Zhilong Ji, Jiancheng Lv, M. Islam, Yongsheng Sang
Proceedings of the 7th International Conference on Robotics and Artificial Intelligence, 2021-11-19
DOI: 10.1145/3505688.3505690 (https://doi.org/10.1145/3505688.3505690)
Abstract
Accurate, real-time recognition of skeleton-based dynamic hand gestures has attracted increasing attention in recent years with the development of depth sensors and improved hand-joint estimation algorithms. The task is challenging because both spatial and temporal features must be modeled, which compounds its complexity. Although previous works have applied a variety of techniques, efficiently encoding the spatial and temporal features at the same time remains difficult. To address this problem, this work presents a Deep Convolutional LSTM (DConvLSTM) model that implicitly learns more discriminative spatiotemporal features from skeleton data. The model employs multi-layer ConvLSTM to accurately capture the multiscale spatial and sequential information of a gesture while preserving fast inference and a lightweight model size. Extensive experiments on three publicly available datasets show strong performance and demonstrate the superiority of our method over other approaches. Furthermore, our method achieves comparable recognition accuracy while maintaining a small model size and short inference time.
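
To make the architecture described above concrete, the following is a minimal sketch (not the authors' code) of stacking ConvLSTM layers over skeleton sequences. It assumes TensorFlow/Keras, a hypothetical input layout of (frames, joints, coordinates, 1), and placeholder values for sequence length, joint count, and number of gesture classes; none of these details are taken from the paper.

import tensorflow as tf

# Placeholder dimensions (assumptions, not from the paper).
NUM_FRAMES, NUM_JOINTS, NUM_COORDS, NUM_CLASSES = 32, 22, 3, 14

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FRAMES, NUM_JOINTS, NUM_COORDS, 1)),
    # First ConvLSTM layer: convolutional filters over the joint/coordinate grid
    # at every frame, with recurrence across frames; return_sequences keeps the
    # temporal axis so a second ConvLSTM layer can be stacked on top.
    tf.keras.layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True),
    tf.keras.layers.BatchNormalization(),
    # Second ConvLSTM layer aggregates higher-level spatiotemporal features and
    # collapses the sequence into a single feature map.
    tf.keras.layers.ConvLSTM2D(64, (3, 3), padding="same", return_sequences=False),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

Keeping the model to a small stack of ConvLSTM layers with global pooling is one way such an architecture can stay lightweight and fast at inference time, in line with the goals stated in the abstract; the exact layer counts, filter sizes, and input representation used by the authors are not specified here.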