Continuous Sign Language Recognition Using Combination of Two Stream 3DCNN and SubUNet

Haryo Pramanto, Suharjito Suharjito
Journal: JURNAL TEKNIK INFORMATIKA
Volume/issue (as listed): 28 3
Published: 2023-12-22
DOI: https://doi.org/10.15408/jti.v16i2.27030
Citations: 0

Abstract

Many computer science researchers have studied sign language recognition with deep learning, but obstacles remain in reaching the expected level of accuracy. Quite a few researchers who set out to study Continuous Sign Language Recognition end up working on Isolated Sign Language Recognition instead. The purpose of this study was to find the best method for performing Continuous Sign Language Recognition using deep learning. The study used the RWTH-PHOENIX-Weather 2014 dataset, which was selected through a literature study of the datasets commonly used in Continuous Sign Language Recognition research, and the proposed method was developed on it. A combination of 3DCNN, LSTM, and CTC models forms part of the proposed architecture. The dataset was also converted into an optical flow frame sequence, which serves as two-stream input alongside the original RGB frame sequence. Performance was evaluated by the Word Error Rate of the predictions. The best Word Error Rate achieved in this study is 94.1%, using the C3D BLSTM CTC model with spatial stream input.
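The abstract names Word Error Rate as the evaluation metric but does not say how it was computed. The sketch below assumes the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between the predicted and reference gloss sequences, divided by the number of reference words. The function name `word_error_rate` and the sample gloss sequences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / len(reference).

    Both arguments are lists of words (glosses). This follows the standard
    definition, which is an assumption about the metric used in the paper.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical gloss sequences, for illustration only.
reference = "MORGEN REGEN NORD WIND".split()
hypothesis = "MORGEN SONNE NORD".split()
print(word_error_rate(reference, hypothesis))
```

Because insertions count as errors while the denominator is the reference length, a WER computed this way is not capped at 100%; a hypothesis much longer than the reference can exceed it.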