Sign language recognition with recurrent neural network using human keypoint detection

Sang-Ki Ko, J. Son, Hyedong Jung
Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems
Published: 2018-10-09
DOI: 10.1145/3264746.3264805 (https://doi.org/10.1145/3264746.3264805)
Citations: 38

Abstract

We study the sign language recognition problem, which is to translate the meaning of signs from visual input such as videos. It is well known that many problems in computer vision require large amounts of data to train deep neural network models. We introduce the KETI sign language dataset, which consists of 10,480 high-resolution, high-quality videos. Since different sign languages are used in different countries, the KETI sign language dataset can serve as a starting point for further research on Korean sign language recognition. Using this dataset, we develop a sign language recognition system that uses human keypoints extracted from the face, hands, and body. The extracted human keypoint vector is standardized by the mean and standard deviation of the keypoints and used as input to a recurrent neural network (RNN). We show that our sign recognition system is robust even when the amount of training data is limited. Our system achieves 89.5% classification accuracy on 100 sentences that can be used in emergency situations.
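The standardization step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the details, so the following are assumptions made only for illustration: keypoints are 2D (x, y) pairs, the mean and standard deviation are computed per frame and separately for x and y, and the RNN is a plain tanh recurrence with random, untrained weights. The function names standardize_keypoints and rnn_classify, the array shapes, and the hidden size are all hypothetical and are not taken from the paper.

```python
import numpy as np

def standardize_keypoints(frames, eps=1e-8):
    """Standardize keypoints frame by frame (assumed scheme, not the paper's code).

    frames: array of shape (T, K, 2) holding K (x, y) keypoints for T frames.
    Returns an array of shape (T, 2*K) laid out as [x0, y0, x1, y1, ...].
    """
    frames = np.asarray(frames, dtype=np.float64)
    mean = frames.mean(axis=1, keepdims=True)  # per-frame mean of x and of y
    std = frames.std(axis=1, keepdims=True)    # per-frame std of x and of y
    normed = (frames - mean) / (std + eps)     # eps guards against zero std
    return normed.reshape(frames.shape[0], -1)

def rnn_classify(features, w_in, w_rec, w_out):
    """Forward pass of a plain tanh RNN; returns logits from the last hidden state."""
    h = np.zeros(w_rec.shape[0])
    for x_t in features:                       # one feature vector per frame
        h = np.tanh(w_in @ x_t + w_rec @ h)
    return w_out @ h

# Toy data: 10 frames, 5 keypoints, 8 hidden units, 100 sentence classes.
rng = np.random.default_rng(0)
T, K, H, C = 10, 5, 8, 100
video = rng.uniform(0, 1080, size=(T, K, 2))   # fake pixel coordinates
feats = standardize_keypoints(video)
logits = rnn_classify(
    feats,
    0.1 * rng.normal(size=(H, 2 * K)),         # input-to-hidden weights
    0.1 * rng.normal(size=(H, H)),             # hidden-to-hidden weights
    0.1 * rng.normal(size=(C, H)),             # hidden-to-output weights
)
predicted_class = int(np.argmax(logits))       # index of the highest-scoring class
```

Standardizing each frame in this way removes dependence on where the signer stands in the image and how large they appear, which is one plausible reason such a step could help when training data is limited. A trained model would replace the random weights used here.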