Spatio-Temporal Approach using CNN-RNN in Hand Gesture Recognition
Mochammad Rifky Gunawan, E. C. Djamal
2021 4th International Conference of Computer and Informatics Engineering (IC2IE), published 2021-09-14
DOI: 10.1109/ic2ie53219.2021.9649108 (https://doi.org/10.1109/ic2ie53219.2021.9649108)
Hand gestures captured on video are one means of communication in human-computer interaction. A video is a collection of sequential images recorded at a given frame rate (frames per second, fps), so the image content can change at any moment. Recognizing a hand gesture in video therefore depends on the pattern in each frame and on how the frames connect to one another; that is, the recognizer treats the video as images in a time sequence. There are several approaches. The single spatial approach collects the image sequence into one large image. Although it achieves good accuracy, it has problems with less responsive backgrounds and fast movement because it captures little information about how image patterns change between adjacent frames. Other approaches are memory-intensive. The temporal approach focuses on comparing image patterns between frames, but it still requires spatial information, or patterns, for every frame and not only the initial one. It is therefore appropriate to combine the two approaches for motion recognition, a combination called Spatio-Temporal. Convolutional Neural Networks (CNN) perform well in image recognition, and Recurrent Neural Networks (RNN) are usually suitable for recognizing sequences and the relationships within them. For hand gesture recognition, this research therefore used a Spatio-Temporal approach with the CNN-RNN method: a CNN in the spatial stream extracts image patterns, and an RNN in the temporal stream extracts the patterns that connect them. The results showed that the combination of CNN and RNN in the Spatio-Temporal approach could recognize which of four hand gestures was shown with an accuracy of 96.43%. The experiments resulted in a configuration of eight CNN convolution layers and two Dense layers in the RNN, with GRU and LSTM architectures.
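The data flow described in the abstract (a spatial stream that summarizes each frame, followed by a temporal stream that links the frame summaries) can be illustrated with a small standard-library Python sketch. It is not the authors' network: the single convolution per filter, the global average pooling, the bias-free GRU cell, the layer sizes, the random untrained weights, and all function names are assumptions chosen to keep the example short. The paper itself reports eight convolution layers and two Dense layers.

```python
import math
import random

def conv2d_relu(frame, kernel):
    """Valid 2D cross-correlation of one grayscale frame, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(frame) - kh + 1, len(frame[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def spatial_features(frame, kernels):
    """Spatial stream: one number per kernel via global average pooling."""
    feats = []
    for k in kernels:
        fmap = conv2d_relu(frame, k)
        feats.append(sum(sum(r) for r in fmap) / (len(fmap) * len(fmap[0])))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def gru_step(x, h, p):
    """Temporal stream: one GRU update (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    # Blend the previous state and the candidate state using the update gate z.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    total = sum(e)
    return [x / total for x in e]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def classify_clip(frames, kernels, gru, w_out):
    """Run frames in order through CNN features, GRU state, and a softmax layer."""
    h = [0.0] * len(gru["Uz"])
    for frame in frames:
        h = gru_step(spatial_features(frame, kernels), h, gru)
    return softmax(matvec(w_out, h))

rng = random.Random(0)
N_FILTERS, HIDDEN, N_CLASSES = 3, 5, 4  # illustrative sizes; only the 4 classes come from the abstract
kernels = [rand_matrix(rng, 3, 3) for _ in range(N_FILTERS)]
gru = {name: rand_matrix(rng, HIDDEN, N_FILTERS if name[0] == "W" else HIDDEN)
       for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
w_out = rand_matrix(rng, N_CLASSES, HIDDEN)
# Six random 8x8 "frames" stand in for a short gesture clip.
clip = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(6)]
probs = classify_clip(clip, kernels, gru, w_out)
print(probs)  # four class probabilities; meaningless until the weights are trained
```

Because the GRU state is carried from frame to frame, the output depends on frame order as well as frame content, which is the property the Spatio-Temporal combination is meant to provide. In practice a deep-learning framework would replace these hand-written loops.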