{"title":"Research on Deep Learning with Gesture Recognition and LSTM in Sign Language","authors":"Yi-Jiuan Chung, Chih-Hsiung Shen","doi":"10.1109/ICKII55100.2022.9983520","DOIUrl":null,"url":null,"abstract":"Sign language is a tool for the hearing impaired to communicate with each other. It is a channel for expressing thoughts and emotions, and also one of the ways they communicate with ordinary people. However, not everyone can read sign language. For those who do not understand sign language, it is difficult to receive its meaning quickly. At the same time, it also causes inconvenience for the hearing impaired. Thus, gesture recognition combined with deep learning techniques of Long Short-Term Memory (LSTM) is used to translate sign language into sentences with the correct meaning in this study. Then, the convenience of communication between hearing-impaired and ordinary people can be enhanced. People can more easily understand the sign language expressions of hearing-impaired and improve their willingness to communicate and interact. The study of Sign Language Recognition (SLR) is to translate the gesture and continuity of the sign language for expressing the semantics, which provides a convenient tool for communication. In this study, we constructed a complete recognition model by combining a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM) neural network to complete continuous recognition work. An ordered image sequence is extracted from the video and converted into a vector through the image database for training and learning sign language using the powerful image recognition capabilities of CNN. Next, the LSTM model is used to connect with the fully connected layer of CNN to complete the accomplished semantic recognition. In particular, the concept of Recurrent Neural Network (RNN) is suitable for time series data processing and the construction of sequence data learning. After making modifications to the traditional RNN architecture, the LSTM performs better in terms of memory and appropriate data length. We built gesture and sign language datasets and adopted the CNN-LSTM recognition method. As a result, a higher recognition rate was achieved with a smaller training set, which meets the needs of real-time SLR systems.","PeriodicalId":352222,"journal":{"name":"2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII )","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 5th International Conference on Knowledge Innovation and Invention (ICKII )","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICKII55100.2022.9983520","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Sign language is a tool for the hearing impaired to communicate with each other. It is a channel for expressing thoughts and emotions, and also one of the ways they communicate with hearing people. However, not everyone can read sign language, and those who do not understand it find it difficult to grasp its meaning quickly, which in turn inconveniences the hearing impaired. In this study, gesture recognition is therefore combined with the Long Short-Term Memory (LSTM) deep learning technique to translate sign language into sentences with the correct meaning, making communication between hearing-impaired and hearing people more convenient: people can more easily understand sign-language expressions, which improves the willingness to communicate and interact. Sign Language Recognition (SLR) translates the gestures of sign language and their temporal continuity into the intended semantics, providing a convenient tool for communication. In this study, we constructed a complete recognition model that combines a Convolutional Neural Network (CNN) with an LSTM neural network to perform continuous recognition. An ordered image sequence is extracted from the video and converted into feature vectors via the image database, using the powerful image recognition capability of the CNN for training and learning sign language. The LSTM is then connected to the fully connected layer of the CNN to complete the semantic recognition. In particular, the Recurrent Neural Network (RNN) is well suited to processing time-series data and learning from sequential data; the LSTM, a modification of the traditional RNN architecture, performs better at retaining memory over an appropriate sequence length. We built gesture and sign language datasets and adopted the CNN-LSTM recognition method. As a result, a higher recognition rate was achieved with a smaller training set, which meets the needs of real-time SLR systems.
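The abstract does not include the authors' implementation, but the pipeline it describes (a CNN frame encoder whose fully connected output feeds an LSTM for sequence-level classification) can be illustrated with a minimal PyTorch sketch. All names (e.g. CNNLSTMSignRecognizer), layer sizes, and the input shape below are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code) of a CNN-LSTM sign recognizer:
# a CNN encodes each frame of an ordered image sequence into a feature
# vector, and an LSTM consumes the feature sequence to predict a label.
import torch
import torch.nn as nn

class CNNLSTMSignRecognizer(nn.Module):
    def __init__(self, num_classes: int, feature_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        # CNN frame encoder: maps each RGB frame to a feature vector.
        # The final Linear layer plays the role of the CNN's fully
        # connected layer that the abstract says feeds the LSTM.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feature_dim), nn.ReLU(),
        )
        # LSTM consumes the ordered sequence of per-frame features.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(feats)    # final hidden state summarizes the clip
        return self.classifier(h_n[-1])   # logits over sign classes

# Usage: a batch of 2 clips, 16 frames each, 64x64 RGB.
model = CNNLSTMSignRecognizer(num_classes=10)
logits = model(torch.randn(2, 16, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 10])
```

Reshaping the clip to (batch * time) lets one CNN encode every frame in a single pass; the LSTM then restores the temporal dimension, which is what makes this architecture suitable for the continuous recognition the paper targets.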