Sign-to-Text Translation from Panamanian Sign Language to Spanish in Continuous Capture Mode with Deep Neural Networks

Big Data and Cognitive Computing. Pub Date: 2024-02-26. DOI: 10.3390/bdcc8030025

Alvaro A. Teran-Quezada, Victor Lopez-Cabrera, J. Rangel, J. Sánchez-Galán




Sign-to-Text Translation from Panamanian Sign Language to Spanish in Continuous Capture Mode with Deep Neural Networks

Convolutional neural networks (CNNs) have driven major advances in sign language recognition (SLR). Recurrent neural networks (RNNs), particularly long short-term memory (LSTM) networks, are in turn well suited to problems involving sequential data. This research proposes a sign language translation system that converts Panamanian Sign Language (PSL) signs into Spanish text using an LSTM model, which, among other things, makes it possible to handle non-static signs by treating them as sequential data. The deep learning model focuses on action detection, here the execution of a sign, which requires precise processing of the frames in which the gesture is made. The proposal is a holistic solution that considers the signer's face and pose in addition to tracking the hands, because visual characteristics beyond hand gestures matter when communicating in a sign language. To train the system, a data set of 330 videos (30 frames each) covering five classes (the signs considered) was created. In testing, the model reached an accuracy of 98.8%, making it a valuable base system for communication between PSL users and Spanish speakers. In conclusion, this work advances the state of the art in PSL–Spanish translation by applying deep learning to the signs that can be translated.
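The pipeline the abstract describes (30-frame sequences of per-frame features, an LSTM, five output classes, continuous capture) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the feature size, hidden size, random untrained weights, and the policy of classifying every full 30-frame window are assumptions made only for illustration, and the names `TinyLSTMClassifier` and `continuous_capture` are hypothetical.

```python
import math
import random
from collections import deque

SEQ_LEN = 30      # frames per video (from the abstract)
NUM_CLASSES = 5   # sign classes (from the abstract)
FEATURES = 8      # ASSUMPTION: size of the flattened hand/face/pose feature vector
HIDDEN = 4        # ASSUMPTION: LSTM hidden size


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyLSTMClassifier:
    """Single-layer LSTM followed by a softmax layer (random, untrained weights)."""

    def __init__(self, features, hidden, classes, seed=0):
        rng = random.Random(seed)
        # One weight matrix per gate (input, forget, output, candidate),
        # applied to the concatenation [x_t, h_{t-1}].
        self.gates = {g: rand_matrix(hidden, features + hidden, rng) for g in "ifog"}
        self.bias = {g: [0.0] * hidden for g in "ifog"}
        self.out = rand_matrix(classes, hidden, rng)
        self.hidden = hidden

    def step(self, x, h, c):
        z = list(x) + h
        pre = {
            g: [a + b for a, b in zip(matvec(self.gates[g], z), self.bias[g])]
            for g in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frames):
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in frames:                 # the sign is consumed frame by frame
            h, c = self.step(x, h, c)
        logits = matvec(self.out, h)     # classify from the final hidden state
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


def continuous_capture(frame_stream, model, window=SEQ_LEN):
    """Yield (class_index, confidence) for every full window of recent frames."""
    buffer = deque(maxlen=window)        # keeps only the latest `window` frames
    for frame in frame_stream:
        buffer.append(frame)
        if len(buffer) == window:
            probs = model.predict_proba(list(buffer))
            best = max(probs)
            yield probs.index(best), best


# Synthetic stand-in for a camera stream: 35 frames of random features.
rng = random.Random(1)
stream = [[rng.random() for _ in range(FEATURES)] for _ in range(35)]
model = TinyLSTMClassifier(FEATURES, HIDDEN, NUM_CLASSES)
predictions = list(continuous_capture(stream, model))
print(len(predictions))  # 35 - 30 + 1 = 6 windows
```

Because the weights are random, the predicted labels carry no meaning; the sketch only shows how a fixed-length window of frames flows through an LSTM into a five-way softmax. A trained system would learn the weights from the labeled videos and map each class index to its Spanish text.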
