A. Mohammed, Yuan Gao, Zhilong Ji, Jiancheng Lv, M. Islam, Yongsheng Sang
{"title":"基于多层卷积LSTM的自动三维骨架动态手势识别","authors":"A. Mohammed, Yuan Gao, Zhilong Ji, Jiancheng Lv, M. Islam, Yongsheng Sang","doi":"10.1145/3505688.3505690","DOIUrl":null,"url":null,"abstract":"Accurate and real-time recognition of skeleton-based dynamic hand gestures has gained increasing attention in recent years with the development of depth sensors and the improved hand joints estimation algorithms. This task is challenging due to the spatial and temporal features that exacerbate the task complexity. Although previous works have applied different techniques, it remains challenging to efficiently and simultaneously encode the spatiotemporal features. To address this problem, this work presents a Deep Convolutional LSTM (DConvLSTM) model to learn more discriminative spatiotemporal features from skeleton data implicitly. The model employs multi-layer ConvLSTM to accurately capture the multiscale spatial and sequential information of the gesture and preserves a fast inference and lightweight size. Extensive experiments on three publicly available datasets show strong performance and demonstrate the superiority of our method by outperforming other methods. Furthermore, our method can achieve comparable recognition accuracy while maintaining small models and short inference time.","PeriodicalId":375528,"journal":{"name":"Proceedings of the 7th International Conference on Robotics and Artificial Intelligence","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automatic 3D Skeleton-based Dynamic Hand Gesture Recognition Using Multi-Layer Convolutional LSTM\",\"authors\":\"A. Mohammed, Yuan Gao, Zhilong Ji, Jiancheng Lv, M. Islam, Yongsheng Sang\",\"doi\":\"10.1145/3505688.3505690\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurate and real-time recognition of skeleton-based dynamic hand gestures has gained increasing attention in recent years with the development of depth sensors and the improved hand joints estimation algorithms. This task is challenging due to the spatial and temporal features that exacerbate the task complexity. Although previous works have applied different techniques, it remains challenging to efficiently and simultaneously encode the spatiotemporal features. To address this problem, this work presents a Deep Convolutional LSTM (DConvLSTM) model to learn more discriminative spatiotemporal features from skeleton data implicitly. The model employs multi-layer ConvLSTM to accurately capture the multiscale spatial and sequential information of the gesture and preserves a fast inference and lightweight size. Extensive experiments on three publicly available datasets show strong performance and demonstrate the superiority of our method by outperforming other methods. 
Furthermore, our method can achieve comparable recognition accuracy while maintaining small models and short inference time.\",\"PeriodicalId\":375528,\"journal\":{\"name\":\"Proceedings of the 7th International Conference on Robotics and Artificial Intelligence\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 7th International Conference on Robotics and Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3505688.3505690\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 7th International Conference on Robotics and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3505688.3505690","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Automatic 3D Skeleton-based Dynamic Hand Gesture Recognition Using Multi-Layer Convolutional LSTM
Accurate and real-time recognition of skeleton-based dynamic hand gestures has gained increasing attention in recent years with the development of depth sensors and improved hand-joint estimation algorithms. The task is challenging because both spatial and temporal features must be modeled, which compounds its complexity. Although previous works have applied a variety of techniques, efficiently encoding the spatial and temporal features at the same time remains difficult. To address this problem, this work presents a Deep Convolutional LSTM (DConvLSTM) model that implicitly learns more discriminative spatiotemporal features from skeleton data. The model employs a multi-layer ConvLSTM to accurately capture the multiscale spatial and sequential information of a gesture while preserving fast inference and a lightweight model size. Extensive experiments on three publicly available datasets show strong performance, with our method outperforming competing approaches. Furthermore, it achieves comparable recognition accuracy while maintaining a small model and short inference time.
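The abstract does not include an implementation, so the following is only a minimal sketch of the general idea it describes: stacking ConvLSTM layers over a sequence of 3D hand-joint positions and classifying the gesture from the final hidden state. It assumes a PyTorch implementation; the joint count (22), number of gesture classes (14), hidden channel sizes, and the treatment of joints x coordinates as a small 2D grid are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a multi-layer ConvLSTM classifier for skeleton-based
# dynamic hand gesture recognition. This is NOT the authors' released code;
# layer sizes, the 22-joint / 14-class setup, and the classifier head are
# illustrative assumptions only.
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """A single ConvLSTM cell: LSTM gates computed with 2D convolutions."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        # One convolution produces all four gates (i, f, o, g) at once.
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size,
                               padding=padding)
        self.hidden_channels = hidden_channels

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c


class DConvLSTMSketch(nn.Module):
    """Stacks ConvLSTM cells over a skeleton sequence and classifies gestures."""

    def __init__(self, in_channels=1, hidden=(32, 64), num_classes=14,
                 joints=22, coords=3):
        super().__init__()
        channels = (in_channels,) + tuple(hidden)
        self.cells = nn.ModuleList(
            [ConvLSTMCell(channels[l], channels[l + 1]) for l in range(len(hidden))])
        self.head = nn.Linear(hidden[-1] * joints * coords, num_classes)

    def forward(self, seq):
        # seq: (batch, time, joints, coords) of 3D joint positions.
        b, t, j, d = seq.shape
        x = seq.view(b, t, 1, j, d)  # add a channel dimension
        states = [(torch.zeros(b, cell.hidden_channels, j, d, device=seq.device),
                   torch.zeros(b, cell.hidden_channels, j, d, device=seq.device))
                  for cell in self.cells]
        for step in range(t):
            inp = x[:, step]
            for l, cell in enumerate(self.cells):
                h, c = cell(inp, states[l])
                states[l] = (h, c)
                inp = h  # the output of one layer feeds the next
        # Classify from the last hidden state of the top layer.
        return self.head(states[-1][0].flatten(1))


if __name__ == "__main__":
    model = DConvLSTMSketch()
    clip = torch.randn(2, 30, 22, 3)  # 2 clips, 30 frames, 22 joints, xyz
    print(model(clip).shape)  # torch.Size([2, 14])
```

Treating the joint/coordinate layout as a small 2D grid lets the convolutional gates share weights across neighboring joints, which is one plausible way to obtain the spatial encoding the abstract attributes to ConvLSTM; the actual input arrangement used by the authors may differ.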