Automatic 3D Skeleton-based Dynamic Hand Gesture Recognition Using Multi-Layer Convolutional LSTM
A. Mohammed, Yuan Gao, Zhilong Ji, Jiancheng Lv, M. Islam, Yongsheng Sang
Proceedings of the 7th International Conference on Robotics and Artificial Intelligence, 2021-11-19
DOI: 10.1145/3505688.3505690 (https://doi.org/10.1145/3505688.3505690)
Abstract
Accurate, real-time recognition of skeleton-based dynamic hand gestures has attracted increasing attention in recent years with the development of depth sensors and improved hand-joint estimation algorithms. The task is challenging because both spatial and temporal features must be modeled, which compounds its complexity. Although previous works have applied a variety of techniques, efficiently encoding the spatial and temporal features at the same time remains difficult. To address this problem, this work presents a Deep Convolutional LSTM (DConvLSTM) model that implicitly learns more discriminative spatiotemporal features from skeleton data. The model employs multi-layer ConvLSTM to accurately capture the multiscale spatial and sequential information of a gesture while preserving fast inference and a lightweight model size. Extensive experiments on three publicly available datasets show strong performance and demonstrate the superiority of our method over other approaches. Furthermore, our method achieves comparable recognition accuracy while maintaining a small model size and short inference time.
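
To make the architecture described above concrete, the following is a minimal sketch (not the authors' code) of stacking ConvLSTM layers over skeleton sequences. It assumes TensorFlow/Keras, a hypothetical input layout of (frames, joints, coordinates, 1), and placeholder values for sequence length, joint count, and number of gesture classes; none of these details are taken from the paper.

import tensorflow as tf

# Placeholder dimensions (assumptions, not from the paper).
NUM_FRAMES, NUM_JOINTS, NUM_COORDS, NUM_CLASSES = 32, 22, 3, 14

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FRAMES, NUM_JOINTS, NUM_COORDS, 1)),
    # First ConvLSTM layer: convolutional filters over the joint/coordinate grid
    # at every frame, with recurrence across frames; return_sequences keeps the
    # temporal axis so a second ConvLSTM layer can be stacked on top.
    tf.keras.layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True),
    tf.keras.layers.BatchNormalization(),
    # Second ConvLSTM layer aggregates higher-level spatiotemporal features and
    # collapses the sequence into a single feature map.
    tf.keras.layers.ConvLSTM2D(64, (3, 3), padding="same", return_sequences=False),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

Keeping the model to a small stack of ConvLSTM layers with global pooling is one way such an architecture can stay lightweight and fast at inference time, in line with the goals stated in the abstract; the exact layer counts, filter sizes, and input representation used by the authors are not specified here.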