3D Sign language recognition based on multi-path hybrid residual neural network

2022 14th International Conference on Machine Learning and Computing (ICMLC). Pub Date: 2022-02-18. DOI: 10.1145/3529836.3529943

Xiaoyu Shi, Xiaoli Jiao, Cangzhen Meng, Zhiyun Bian


Abstract

Sign language is an important means of communication for deaf-mute people. In recent years, hybrid models that combine Bi-directional Long Short-Term Memory (BiLSTM) with 3D convolutional networks have exploited both the feature-extraction ability of convolutional neural networks and the time-series classification strengths of recurrent neural networks to achieve more accurate recognition. However, high precision, scalability, and robustness remain important challenges for sign language recognition research. As computer hardware and networks improve, the main research directions and corresponding methods aim to raise the accuracy and speed of hybrid-model recognition of 3D poses and continuous sign language sentences. This paper improves a residual neural network and uses it to extract features, which are then modeled with a BiLSTM; the proposed hybrid model thus combines the improved residual network with BiLSTM. To validate the proposed algorithm, we use the Chalearn dataset and Sports-1M dataset captured with depth, color and stereo-IR sensors. On these two challenging datasets, our multi-path hybrid residual neural network achieves accuracies of 78.9% and 82.7%, outperforming other state-of-the-art algorithms and approaching the human accuracy of 88.4%.
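The abstract does not give layer counts, kernel sizes, or path widths, so the following is only a shape-tracing sketch of how a multi-path residual 3D feature extractor could feed a BiLSTM. Every name and number in it (`ClipShape`, `path_channels`, `lstm_hidden`, `num_classes`, the 1x1x1 shortcut projection, the spatial pooling step) is a hypothetical choice made for illustration and is not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipShape:
    """Shape of one video clip tensor: (channels, frames, height, width)."""
    channels: int
    frames: int
    height: int
    width: int


def conv3d_same(shape, out_channels):
    """Assumed stride-1 3D convolution with 'same' padding.

    Under that assumption only the channel count changes.
    """
    return ClipShape(out_channels, shape.frames, shape.height, shape.width)


def residual_path(shape, out_channels):
    """One residual path: two stacked convolutions plus a shortcut.

    The element-wise residual add needs identical shapes, so the shortcut
    is assumed to be a 1x1x1 projection to out_channels.
    """
    branch = conv3d_same(conv3d_same(shape, out_channels), out_channels)
    shortcut = conv3d_same(shape, out_channels)
    assert branch == shortcut, "residual add needs matching shapes"
    return branch


def multi_path_block(shape, path_channels):
    """Run several residual paths in parallel; concatenate along channels."""
    outputs = [residual_path(shape, c) for c in path_channels]
    total = sum(o.channels for o in outputs)
    return ClipShape(total, shape.frames, shape.height, shape.width)


def hybrid_output_shapes(clip, path_channels, lstm_hidden, num_classes):
    """Trace tensor shapes through the assumed CNN-to-BiLSTM pipeline."""
    features = multi_path_block(clip, path_channels)
    # Assumed global average pooling over height and width:
    # one feature vector per frame.
    sequence = (features.frames, features.channels)
    # A bidirectional LSTM concatenates forward and backward hidden states.
    bilstm = (features.frames, 2 * lstm_hidden)
    # Assumed classifier over a time-pooled BiLSTM output.
    logits = (num_classes,)
    return {
        "features": features,
        "sequence": sequence,
        "bilstm": bilstm,
        "logits": logits,
    }


if __name__ == "__main__":
    clip = ClipShape(channels=3, frames=16, height=112, width=112)
    shapes = hybrid_output_shapes(
        clip, path_channels=(32, 32, 64), lstm_hidden=128, num_classes=20
    )
    for name, shape in shapes.items():
        print(name, shape)
```

The point of the sketch is the data flow only: parallel residual paths keep the time axis intact, so the BiLSTM receives one feature vector per frame and can model the gesture in both temporal directions.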


