3D Sign language recognition based on multi-path hybrid residual neural network

Xiaoyu Shi, Xiaoli Jiao, Cangzhen Meng, Zhiyun Bian
2022 14th International Conference on Machine Learning and Computing (ICMLC), published 2022-02-18
DOI: https://doi.org/10.1145/3529836.3529943

Abstract

Sign language is an important means of communication for deaf-mute people. In recent years, hybrid models that combine Bi-directional Long Short-Term Memory (BiLSTM) with 3D convolutional networks have exploited both the feature-extraction ability of convolutional neural networks and the time-series classification strengths of recurrent neural networks to achieve more accurate recognition. However, high precision, scalability, and robustness remain important challenges for sign language recognition research. The main research directions and corresponding methods aim to improve the accuracy and speed of hybrid-model-based recognition of 3D poses and continuous sign language sentences as computer hardware and networks improve. This paper proposes an improved residual neural network, uses it to extract features, and combines it with BiLSTM to form the proposed hybrid model. To validate the proposed algorithm, we use the Chalearn dataset and the Sports-1M dataset, captured with depth, color, and stereo-IR sensors. On these two challenging datasets, our multi-path hybrid residual neural network achieves accuracies of 78.9% and 82.7%, outperforming other state-of-the-art algorithms and approaching the human accuracy of 88.4%.
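
The abstract gives no layer-level details, so the following is only a minimal PyTorch sketch of the general idea it describes: a residual 3D convolutional feature extractor with more than one path, followed by a BiLSTM over the frame axis. The class names, the two-path layout, the channel and hidden sizes, and the pooling choices are all illustrative assumptions, not the authors' architecture.

```python
import torch
from torch import nn


class MultiPathResidualBlock3D(nn.Module):
    """Hypothetical block: two parallel 3D-conv paths added to an identity shortcut."""

    def __init__(self, channels: int):
        super().__init__()
        # Path A (assumed): full spatio-temporal 3x3x3 convolution.
        self.path_a = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(channels),
        )
        # Path B (assumed): spatial-only 1x3x3 convolution.
        self.path_b = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                      padding=(0, 1, 1), bias=False),
            nn.BatchNorm3d(channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: the input is added to the sum of both paths.
        return self.relu(x + self.path_a(x) + self.path_b(x))


class HybridSignModel(nn.Module):
    """Hypothetical hybrid: 3D residual features per frame, then a BiLSTM."""

    def __init__(self, in_channels: int = 3, channels: int = 16,
                 hidden: int = 32, num_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv3d(in_channels, channels, kernel_size=3, padding=1)
        self.block = MultiPathResidualBlock3D(channels)
        # Pool away height and width but keep the frame axis for the BiLSTM.
        self.spatial_pool = nn.AdaptiveAvgPool3d((None, 1, 1))
        self.bilstm = nn.LSTM(channels, hidden, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        feats = self.block(self.stem(clip))
        feats = self.spatial_pool(feats).flatten(2)  # (batch, channels, frames)
        seq = feats.transpose(1, 2)                  # (batch, frames, channels)
        out, _ = self.bilstm(seq)                    # (batch, frames, 2 * hidden)
        # Average the BiLSTM outputs over time, then classify the whole clip.
        return self.classifier(out.mean(dim=1))
```

For example, `HybridSignModel(num_classes=10)(torch.randn(2, 3, 8, 16, 16))` yields one row of class scores per clip. Averaging over time suits isolated-sign classification; continuous sentence recognition would instead need per-frame outputs and a sequence loss, which this sketch does not attempt.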