Authors: M. Ismail, Shefa A. Dawwd, F. Ali
Venue: 2021 2nd Information Technology To Enhance e-learning and Other Application (IT-ELA)
Published: 2021-12-28
DOI: 10.1109/IT-ELA52201.2021.9773404
Arabic Sign Language Detection Using Deep Learning Based Pose Estimation
When processing a series of video frames captured by a camera, it is necessary to determine whether the person is signing and, if so, whether the sign is static or dynamic. Sign detection offers two benefits: first, it establishes whether there is a sign to recognize at all; second, a static sign can be recognized from a single frame, while a dynamic sign requires a series of frames. The presented research aims to develop a model that detects a signer in an Arabic sign language video stream and classifies each segment as a static sign, a dynamic sign, or a non-sign. A large dataset is needed to identify signs and obtain better results, so 7,500 videos were captured and collected for this purpose. The proposed system extracts human-pose keypoints from video frames using the MediaPipe library, computes distance and angle features from these keypoints, and trains a Bidirectional Gated Recurrent Unit (BiGRU) model on those features, detecting sign language in real time with 99% test accuracy.
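The feature step described in the abstract, deriving distances and angles from pose keypoints, can be sketched as below. This is a minimal illustration, not the authors' implementation: MediaPipe emits (x, y, z) landmarks per frame, but the specific landmark pairs and triples the paper uses are not given, so the ones here are hypothetical.

```python
import numpy as np

def pairwise_distance(p, q):
    """Euclidean distance between two landmarks (x, y, z)."""
    return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))

def joint_angle(a, b, c):
    """Angle in degrees at landmark b, formed by the segments b->a and b->c."""
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def frame_features(landmarks, dist_pairs, angle_triples):
    """Concatenate distance and angle features for one frame.

    landmarks: sequence of (x, y, z) points, indexed like MediaPipe landmarks.
    dist_pairs / angle_triples: which indices to measure (assumed, not from the paper).
    """
    feats = [pairwise_distance(landmarks[i], landmarks[j]) for i, j in dist_pairs]
    feats += [joint_angle(landmarks[i], landmarks[j], landmarks[k])
              for i, j, k in angle_triples]
    return np.array(feats, dtype=np.float32)

# Example: three toy landmarks, one distance and one angle feature.
pts = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
vec = frame_features(pts, dist_pairs=[(0, 1)], angle_triples=[(1, 0, 2)])
```

Stacking one such vector per frame yields the sequence that a BiGRU can consume; because distances and angles are invariant to translation of the whole skeleton, they make more stable inputs than raw coordinates.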