An efficient hand gesture recognition based on optimal deep embedded hybrid convolutional neural network‐long short term memory network model

Gajalakshmi Palanisamy, T. Sharmila
Journal: Concurrency and Computation: Practice and Experience
Publication date: 2022-06-24
Publication type: Journal Article
DOI: 10.1002/cpe.7109 (https://doi.org/10.1002/cpe.7109)
Citation count: 3

Abstract

Hand gestures are a form of nonverbal communication, used in particular by individuals who cannot express their thoughts in words. They are mainly used in human-computer interaction (HCI), in communication with deaf and mute people, and in robotic interface applications. Gesture recognition is a field of computer science focused on improving HCI via touch screens, cameras, and kinetic devices. State-of-the-art systems mainly use computer vision-based techniques that combine a motion sensor and a camera to capture hand gestures in real time and interpret them with machine learning algorithms. Conventional machine learning algorithms often struggle with the variability present in visible hand gesture images, such as skin color, distance, lighting, hand direction, position, and background. In this article, an adaptive weighted multi-scale resolution (AWMSR) network with a deep embedded hybrid convolutional neural network and long short-term memory network (hybrid CNN-LSTM) is proposed for identifying different hand gesture signs with higher recognition accuracy. The proposed methodology consists of three steps: input preprocessing, feature extraction, and classification. To reduce the complex visual effects present in the input images, histogram equalization is applied, which adjusts the gray-level pixel values according to their occurrence probability. The multi-block local binary pattern (MB-LBP) algorithm is employed for feature extraction; it extracts crucial image features such as hand shape structure, curvature, and invariant movements. The AWMSR with the deep embedded hybrid CNN-LSTM network is applied to two benchmark datasets, the Jochen Triesch static hand posture dataset and the NUS hand posture dataset-II, to assess its stability in identifying different hand gestures.
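The preprocessing and feature-extraction steps named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 8-bit grayscale input, the block size, and the clockwise bit order are all assumptions made for illustration. It uses the textbook forms of histogram equalization (a lookup table built from the cumulative gray-level distribution) and MB-LBP (comparing the mean intensity of eight neighbouring blocks with the centre block).

```python
import numpy as np


def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image using its cumulative distribution."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value at the darkest gray level present
    total = gray.size
    if total == cdf_min:  # constant image: there is nothing to spread out
        return gray.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


def mb_lbp(gray: np.ndarray, block: int = 2) -> np.ndarray:
    """Return one 8-bit MB-LBP code per valid 3x3 arrangement of block x block cells."""
    img = gray.astype(np.float64)
    h, w = img.shape
    oh, ow = h - 3 * block + 1, w - 3 * block + 1
    if oh <= 0 or ow <= 0:
        raise ValueError("image is smaller than the 3x3 block neighbourhood")
    # Integral image with a leading zero row/column: each block sum is four lookups.
    ii = np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    # sums[y, x] is the sum of the block whose top-left pixel is (y, x).
    # Blocks have equal size, so comparing sums is equivalent to comparing means.
    sums = (ii[block:, block:] - ii[:-block, block:]
            - ii[block:, :-block] + ii[:-block, :-block])

    def cell(i: int, j: int) -> np.ndarray:
        return sums[i * block:i * block + oh, j * block:j * block + ow]

    centre = cell(1, 1)
    # Assumed bit order: clockwise, starting at the top-left block.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros((oh, ow), dtype=np.uint8)
    for bit, (i, j) in enumerate(order):
        codes |= (cell(i, j) >= centre).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def mb_lbp_histogram(gray: np.ndarray, block: int = 2) -> np.ndarray:
    """Normalized 256-bin histogram of MB-LBP codes, usable as a feature vector."""
    codes = mb_lbp(gray, block)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
print(equalize_histogram(low_contrast).tolist())  # → [[0, 85], [170, 255]]
```

In this sketch a low-contrast patch spanning gray levels 100 to 103 is stretched over the full 0 to 255 range, and the MB-LBP histogram yields a fixed-length descriptor regardless of image size. How the paper feeds such features into the AWMSR and CNN-LSTM stages is not specified in the abstract.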
The weight function of the deep embedded CNN-LSTM architecture is optimized using the puzzle optimization algorithm. The efficiency of the proposed methodology is verified using performance evaluation metrics such as accuracy, loss, confusion matrix, intersection over union (IoU), and execution time. The proposed methodology achieves recognition accuracies of 97.86% and 98.32% on the two datasets.
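Several of the evaluation metrics listed above can be derived from a single confusion matrix. The sketch below is a hedged illustration, not the authors' evaluation code: the function names and the toy labels are hypothetical, and computing IoU per class as TP / (TP + FP + FN) is only one common reading of "intersection over union" for a classifier; the abstract does not say how the paper defines it.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes: int) -> np.ndarray:
    """Count predictions; rows index the true class, columns the predicted class."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulates correctly for repeated index pairs
    return cm


def accuracy(cm: np.ndarray) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    return float(np.trace(cm) / cm.sum())


def per_class_iou(cm: np.ndarray) -> np.ndarray:
    """Per-class IoU = TP / (TP + FP + FN); classes that never occur get 0."""
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but belonging to another
    fn = cm.sum(axis=1) - tp  # belonging to the class, but predicted as another
    union = tp + fp + fn
    return np.divide(tp, union, out=np.zeros_like(tp), where=union > 0)


# Toy labels for three hypothetical gesture classes (not data from the paper).
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(cm.tolist())  # → [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
```

For the toy labels, four of six predictions are correct, so accuracy is 4/6, and the per-class IoU values are 1/3, 2/3, and 1/2.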