Two stream GRU model with ELU activation function for sign language recognition

Intelligent Systems with Applications, Pub Date: 2025-04-05, DOI: 10.1016/j.iswa.2025.200513

Kasian Myagila, Devotha Godfrey Nyambo, Mussa Ally Dida


Citations: 0

Abstract

Two stream GRU model with ELU activation function for sign language recognition

Pose estimation features have been used successfully in human activity recognition, including sign language recognition. One of the key challenges in sign language recognition is handling the signer-independent mode and the signer's hand dominance. This paper proposes a Gated Recurrent Unit (GRU) with the ELU activation function to improve computational efficiency and model learning efficiency. In addition, the paper proposes a two-stream model architecture to address the challenge of left- and right-hand dominance. The study developed the model using a Tanzania Sign Language dataset collected with mobile devices, with pose estimation features extracted using the MediaPipe Holistic framework. According to the results, the proposed model not only achieves an overall accuracy of 95% but also trains more efficiently than comparable algorithms. In the signer-independent mode in particular, the two-stream approach led to substantial improvements, achieving a maximum accuracy of 92% and a minimum accuracy of 70%, with accuracy for the left-handed signer increasing by 37%. The results highlight the effectiveness of the two-stream approach in overcoming the challenges that arise from signer-specific hand dominance. Additionally, the results indicate that the proposed model can be beneficial where computational resources are limited while also improving overall performance.
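To make the "GRU with the ELU activation function" concrete, the following is a minimal pure-Python sketch of a single GRU step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say where ELU is applied, so the sketch assumes it replaces the usual tanh in the candidate state, with sigmoid gates left unchanged. The names `elu`, `affine`, `gru_step`, the parameter dictionary keys, and the toy weights are all hypothetical. ELU itself is standard: it returns x for x > 0 and alpha * (exp(x) - 1) otherwise.

```python
import math


def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid(x):
    """Logistic function used for the GRU gates."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b for plain nested lists."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def gru_step(x, h, p):
    """One GRU step. Assumption: ELU replaces tanh in the candidate state."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]  # reset-gated previous state
    cand = [elu(v) for v in affine(p["Wh"], x, p["Uh"], rh, p["bh"])]  # candidate
    # Blend the previous state and the candidate, weighted by the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


# Toy run: 2 features per frame, 2 hidden units, hand-picked (hypothetical) weights.
params = {
    "Wz": [[0.1, -0.2], [0.3, 0.0]], "Uz": [[0.0, 0.1], [0.2, -0.1]], "bz": [0.0, 0.0],
    "Wr": [[-0.1, 0.2], [0.0, 0.4]], "Ur": [[0.1, 0.0], [0.0, 0.1]], "br": [0.0, 0.0],
    "Wh": [[0.5, -0.3], [0.2, 0.6]], "Uh": [[0.1, 0.2], [-0.2, 0.1]], "bh": [0.0, 0.0],
}
h = [0.0, 0.0]
for frame in [[0.5, -1.0], [0.2, 0.3], [-0.4, 0.8]]:
    h = gru_step(frame, h, params)
print(h)  # final hidden state after three frames
```

In a sign language setting, each `frame` would stand in for the pose features of one video frame and the final hidden state would feed a classifier. How the paper's two streams are built and merged is not described in the abstract, so that part is left out of the sketch.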


Source journal

Intelligent Systems with Applications

CiteScore

5.60

Self-citation rate

0.00%

Publication count