Isolated Sign Recognition with a Siamese Neural Network of RGB and Depth Streams
Anil Osman Tur, H. Keles
IEEE EUROCON 2019 - 18th International Conference on Smart Technologies, 2019-07-01
DOI: 10.1109/EUROCON.2019.8861945
Sign recognition is a challenging problem because signs vary widely across signers and the input spans multiple modalities. The challenges of action classification in computer vision, such as variations in illumination and background, apply in this domain as well. In this work, we propose a Siamese Neural Network (SNN) architecture that extracts features from the RGB and depth streams of a sign frame in parallel. We use a pretrained model for the SNN without fine-tuning it on our training data. We then apply global feature pooling to the depth and color features that the SNN generates and feed the concatenation of the selected features to a recurrent neural network (RNN) to discriminate the signs. We trained our model parameters on the Montalbano dataset and achieved 93.19% test accuracy with ResNet-50 and 91.61% with VGG-16 as the network model.
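The data flow described in the abstract (shared-weight feature extraction on both streams, global pooling, concatenation, then a recurrent classifier) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: `toy_extractor` is a hypothetical stand-in for the frozen pretrained CNN, average pooling is one possible choice of global pooling, the recurrent part is a plain tanh RNN with random weights, frames are tiny single-channel grids, and the class count of 3 is arbitrary.

```python
import math
import random


def toy_extractor(frame, weights):
    # Hypothetical stand-in for the frozen, pretrained CNN.
    # Produces one feature map per weight: ReLU(weight * pixel).
    return [[[max(0.0, w * px) for px in row] for row in frame] for w in weights]


def global_avg_pool(feature_maps):
    # Collapse each 2-D feature map to a single number (its mean).
    # Average pooling is an assumption; the abstract only says "global feature pooling".
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def frame_features(rgb_frame, depth_frame, weights):
    # Siamese step: the same weights process both streams.
    rgb_vec = global_avg_pool(toy_extractor(rgb_frame, weights))
    depth_vec = global_avg_pool(toy_extractor(depth_frame, weights))
    return rgb_vec + depth_vec  # list concatenation of the two pooled vectors


def matvec(matrix, vector):
    # Matrix-vector product on nested lists.
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_classify(sequence, w_xh, w_hh, w_hy):
    # Plain tanh RNN over the per-frame vectors; the class is the
    # argmax of the logits computed from the final hidden state.
    hidden = [0.0] * len(w_hh)
    for x in sequence:
        pre = [a + b for a, b in zip(matvec(w_xh, x), matvec(w_hh, hidden))]
        hidden = [math.tanh(p) for p in pre]
    logits = matvec(w_hy, hidden)
    return max(range(len(logits)), key=logits.__getitem__)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def random_clip(num_frames, height, width, rng):
    return [
        [[rng.random() for _ in range(width)] for _ in range(height)]
        for _ in range(num_frames)
    ]


rng = random.Random(0)
weights = [0.5, -1.0, 2.0]  # three toy feature channels, shared by both streams
num_frames, height, width = 4, 2, 2
hidden_size, num_classes = 5, 3  # arbitrary illustrative sizes

rgb_clip = random_clip(num_frames, height, width, rng)
depth_clip = random_clip(num_frames, height, width, rng)

# One concatenated RGB+depth feature vector per frame.
sequence = [frame_features(r, d, weights) for r, d in zip(rgb_clip, depth_clip)]

w_xh = random_matrix(hidden_size, 2 * len(weights), rng)
w_hh = random_matrix(hidden_size, hidden_size, rng)
w_hy = random_matrix(num_classes, hidden_size, rng)

predicted = rnn_classify(sequence, w_xh, w_hh, w_hy)
print("predicted class index:", predicted)
```

Because the extractor weights are shared and never updated, only the recurrent weights (`w_xh`, `w_hh`, `w_hy` in this sketch) would need training, which matches the abstract's statement that the pretrained model is used without fine-tuning.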