{"title":"基于多路径混合残差神经网络的三维手语识别","authors":"Xiaoyu Shi, Xiaoli Jiao, Cangzhen Meng, Zhiyun Bian","doi":"10.1145/3529836.3529943","DOIUrl":null,"url":null,"abstract":"Abstract: Sign language is an important communicating method for deaf-mute people. In recent years, the hybrid model between the Bi-directional Long-Short Term Memory (BiLSTM) and 3D convolutional network model makes full use of the feature extraction ability of convolutional neural networks and the advantages of time series classification of the recurrent neural network model to achieve more accurate recognition. However, high precision, scalability and robustness are still important challenges in future sign language recognition research. The main research direction and responding research methods aim to improve the accuracy and speed of 3D poses and continuous sentences sign language recognition based on hybrid models with the upgrading of computer hardware equipment and network. The paper improves a novel residual neural network and then engages it to extract features and build models with BiLSTM. The proposed hybrid model combines the improved neural network and Bi-directional Long-Short Term Memory (BiLSTM). In order to validate the proposed algorithm, we introduce the Chalearn dataset and Sports-1M dataset captured with depth, color and stereo-IR sensors. On the two challenging datasets, our multi-path hybrid residual neural network achieves an accuracy of 78.9% and 82.7%, outperforms other state-of-the-art algorithms, and is close to human accuracy of 88.4%.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"3D Sign language recognition based on multi-path hybrid residual neural network\",\"authors\":\"Xiaoyu Shi, Xiaoli Jiao, Cangzhen Meng, Zhiyun Bian\",\"doi\":\"10.1145/3529836.3529943\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract: Sign language is an important communicating method for deaf-mute people. In recent years, the hybrid model between the Bi-directional Long-Short Term Memory (BiLSTM) and 3D convolutional network model makes full use of the feature extraction ability of convolutional neural networks and the advantages of time series classification of the recurrent neural network model to achieve more accurate recognition. However, high precision, scalability and robustness are still important challenges in future sign language recognition research. The main research direction and responding research methods aim to improve the accuracy and speed of 3D poses and continuous sentences sign language recognition based on hybrid models with the upgrading of computer hardware equipment and network. The paper improves a novel residual neural network and then engages it to extract features and build models with BiLSTM. The proposed hybrid model combines the improved neural network and Bi-directional Long-Short Term Memory (BiLSTM). In order to validate the proposed algorithm, we introduce the Chalearn dataset and Sports-1M dataset captured with depth, color and stereo-IR sensors. 
On the two challenging datasets, our multi-path hybrid residual neural network achieves an accuracy of 78.9% and 82.7%, outperforms other state-of-the-art algorithms, and is close to human accuracy of 88.4%.\",\"PeriodicalId\":285191,\"journal\":{\"name\":\"2022 14th International Conference on Machine Learning and Computing (ICMLC)\",\"volume\":\"60 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-02-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 14th International Conference on Machine Learning and Computing (ICMLC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3529836.3529943\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3529836.3529943","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
3D Sign language recognition based on multi-path hybrid residual neural network
Abstract: Sign language is an important means of communication for deaf and hard-of-hearing people. In recent years, hybrid models that combine 3D convolutional networks with Bidirectional Long Short-Term Memory (BiLSTM) have exploited both the feature-extraction ability of convolutional neural networks and the temporal-classification strengths of recurrent networks to achieve more accurate recognition. However, high precision, scalability, and robustness remain important challenges for future sign language recognition research. The main research direction, supported by continuing upgrades in computer hardware and networks, is to improve the accuracy and speed of recognizing 3D poses and continuous sign language sentences with hybrid models. This paper proposes an improved residual neural network, uses it to extract features, and combines it with BiLSTM to form the proposed hybrid model. To validate the proposed algorithm, we evaluate it on the ChaLearn and Sports-1M datasets, which were captured with depth, color, and stereo-IR sensors. On these two challenging datasets, our multi-path hybrid residual neural network achieves accuracies of 78.9% and 82.7%, outperforming other state-of-the-art algorithms and approaching the human accuracy of 88.4%.
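
The abstract describes a pipeline in which a residual 3D convolutional feature extractor feeds a BiLSTM for temporal classification. The sketch below is a minimal illustration of that general pattern, not the authors' exact architecture: the number of paths, layer sizes, class count, and the names HybridSignRecognizer and MultiPathResidualBlock3D are illustrative assumptions.

# Minimal sketch (assumed layer sizes and structure, not the published network):
# a multi-path residual 3D-CNN extracts per-frame features, then a BiLSTM
# classifies the resulting temporal sequence.
import torch
import torch.nn as nn


class MultiPathResidualBlock3D(nn.Module):
    """Two parallel 3D-conv paths summed with an identity skip connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.path_a = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
        )
        self.path_b = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=1),
            nn.BatchNorm3d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.path_a(x) + self.path_b(x))


class HybridSignRecognizer(nn.Module):
    """3D-CNN residual feature extractor followed by a BiLSTM over time."""

    def __init__(self, num_classes: int, channels: int = 32, hidden: int = 128):
        super().__init__()
        self.stem = nn.Conv3d(3, channels, kernel_size=3, padding=1)
        self.block = MultiPathResidualBlock3D(channels)
        self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))  # pool space, keep time
        self.bilstm = nn.LSTM(channels, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, clip):
        # clip: (batch, 3, frames, height, width)
        feat = self.block(self.stem(clip))
        feat = self.pool(feat).squeeze(-1).squeeze(-1)  # (batch, channels, frames)
        feat = feat.transpose(1, 2)                     # (batch, frames, channels)
        out, _ = self.bilstm(feat)
        return self.classifier(out[:, -1])              # classify from last time step


# Example: two 16-frame RGB clips at 112x112; the class count is illustrative.
model = HybridSignRecognizer(num_classes=249)
logits = model(torch.randn(2, 3, 16, 112, 112))
print(logits.shape)  # torch.Size([2, 249])

In this sketch the residual block provides the multi-path spatial-temporal feature extraction attributed to the convolutional part of the hybrid model, while the BiLSTM consumes the frame-wise feature sequence, matching the division of labor described in the abstract.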