Three-dimensional dynamic gesture recognition method based on convolutional neural network

Impact Factor: 3.0 | JCR: Q2 | Category: Computer Science, Information Systems

High-Confidence Computing | Pub Date: 2024-11-06 | DOI: 10.1016/j.hcc.2024.100280

Ji Xi, Weiqi Zhang, Zhe Xu, Saide Zhu, Linlin Tang, Li Zhao

High-Confidence Computing, Volume 5, Issue 1, Article 100280. Journal Article. https://www.sciencedirect.com/science/article/pii/S2667295224000837

Citations: 0

Abstract

With the rapid advancement of virtual reality, dynamic gesture recognition has become a critical technique for human–computer interaction in virtual environments. Recognizing dynamic gestures is challenging because of their high degree of freedom, individual differences between users, and changes in gesture space. To address the low recognition accuracy of existing networks, an improved dynamic gesture recognition algorithm based on the ResNeXt architecture is proposed. The algorithm uses three-dimensional convolution to capture the spatiotemporal features intrinsic to dynamic gestures. To sharpen the model’s focus and improve its accuracy in identifying dynamic gestures, a lightweight convolutional attention mechanism is introduced; this mechanism improves the model’s precision and also speeds up convergence during training. To further improve performance, a deep attention submodule is added to the convolutional attention module to strengthen the network’s temporal feature extraction. Empirical evaluations on the EgoGesture and NvGesture datasets show that the proposed model reaches accuracies of 95.03% and 86.21%, respectively. When operating in RGB mode, the accuracies are 93.49% and 80.22%, respectively. These results underscore the effectiveness of the proposed algorithm in recognizing dynamic gestures with high accuracy and show its potential for advanced human–computer interaction systems.
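To make concrete what it means for a three-dimensional convolution to capture spatiotemporal features, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy clips, and the temporal-difference kernel are all hypothetical, chosen only to show that a kernel spanning the time axis responds to motion and stays silent on a static clip.

```python
# Illustrative sketch only: a naive 3D convolution over a (time, height, width) clip.
# All names, clips, and kernels here are hypothetical and are not taken from the paper.

def conv3d_valid(clip, kernel):
    """Slide `kernel` over `clip` with stride 1 and no padding ("valid" mode).

    Both arguments are nested lists indexed as [t][y][x].
    As in deep-learning libraries, the kernel is not flipped (cross-correlation).
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the kernel's temporal and spatial extent at once.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_clip(positions, height=3, width=4):
    """Build a toy clip: one bright pixel per frame, in row 1 at column positions[t]."""
    clip = []
    for col in positions:
        frame = [[0.0] * width for _ in range(height)]
        frame[1][col] = 1.0
        clip.append(frame)
    return clip


# Temporal-difference kernel of shape (2, 1, 1): next frame minus current frame.
motion_kernel = [[[-1.0]], [[1.0]]]

moving = make_clip([0, 1, 2])  # the bright pixel shifts right each frame
static = make_clip([1, 1, 1])  # the bright pixel never moves

moving_response = conv3d_valid(moving, motion_kernel)
static_response = conv3d_valid(static, motion_kernel)
```

In this toy setup the moving clip produces nonzero responses where the pixel leaves and arrives, while the static clip produces zeros everywhere. A trained network would learn many such kernels with larger spatial extent rather than use a hand-written one.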


Source journal: High-Confidence Computing

CiteScore: 4.70

Self-citation rate: 0.00%

Articles published