Continuous Sign Language Recognition Using Combination of Two Stream 3DCNN and SubUNet

JURNAL TEKNIK INFORMATIKA, Pub Date: 2023-12-22, DOI: 10.15408/jti.v16i2.27030

Haryo Pramanto, Suharjito Suharjito


Citations: 0

Abstract



Sign language recognition with deep learning has been studied extensively in computer science, but reaching the expected level of accuracy remains difficult. Many researchers who set out to work on Continuous Sign Language Recognition end up addressing only Isolated Sign Language Recognition. This study aimed to find the best deep learning method for Continuous Sign Language Recognition. It uses the RWTH-PHOENIX-Weather 2014 dataset, which was selected through a literature study of the datasets commonly used in Continuous Sign Language Recognition research, to develop the proposed method. A combination of 3DCNN, LSTM, and CTC models forms part of the proposed architecture. The dataset is also converted into an optical flow frame sequence, which serves as two-stream input alongside the original RGB frame sequence. Performance is evaluated by the Word Error Rate of the predictions. The best Word Error Rate achieved is 94.1%, obtained with the C3D BLSTM CTC model using spatial stream input.
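The abstract evaluates a CTC-based model by Word Error Rate. As a minimal sketch of how those two pieces usually fit together (not the authors' code), the Python below collapses a frame-level label sequence with best-path CTC decoding and then scores the result against a reference gloss sequence. The blank symbol name, the function names, and the example glosses are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

BLANK = "<blank>"  # assumed name for the CTC blank symbol


def ctc_greedy_collapse(frame_labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Best-path CTC decoding: merge consecutive repeats, then drop blanks."""
    collapsed: List[str] = []
    previous: Optional[str] = None
    for label in frame_labels:
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only one previous row.
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous_row[j - 1] + (ref_word != hyp_word)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row
    return previous_row[-1] / len(reference)


if __name__ == "__main__":
    # Hypothetical per-frame argmax labels and reference glosses.
    frames = [BLANK, "MORGEN", "MORGEN", BLANK, "REGEN", "REGEN", BLANK, "NORD"]
    reference = ["MORGEN", "VIEL", "REGEN", "NORD"]

    hypothesis = ctc_greedy_collapse(frames)
    print(hypothesis)                              # ['MORGEN', 'REGEN', 'NORD']
    print(word_error_rate(reference, hypothesis))  # 0.25 (one deletion out of four)
```

Lower WER is better, and because insertions are counted, the value is not capped at 100%.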

