Hand Gesture Recognition for Multi-Culture Sign Language Using Graph and General Deep Learning Network

Abu Saleh Musa Miah;Md. Al Mehedi Hasan;Yoichi Tomioka;Jungpil Shin
{"title":"Hand Gesture Recognition for Multi-Culture Sign Language Using Graph and General Deep Learning Network","authors":"Abu Saleh Musa Miah;Md. Al Mehedi Hasan;Yoichi Tomioka;Jungpil Shin","doi":"10.1109/OJCS.2024.3370971","DOIUrl":null,"url":null,"abstract":"Hand gesture-based Sign Language Recognition (SLR) serves as a crucial communication bridge between hard of hearing and non-deaf individuals. The absence of a universal sign language (SL) leads to diverse nationalities having various cultural SLs, such as Korean, American, and Japanese sign language. Existing SLR systems perform well for their cultural SL but may struggle with other or multi-cultural sign languages (McSL). To address these challenges, this paper introduces a novel end-to-end SLR system called GmTC, designed to translate McSL into equivalent text for enhanced understanding. Here, we employed a Graph and General deep-learning network as two stream modules to extract effective features. In the first stream, produce a graph-based feature by taking advantage of the superpixel values and the graph convolutional network (GCN), aiming to extract distance-based complex relationship features among the superpixel. In the second stream, we extracted long-range and short-range dependency features using attention-based contextual information that passes through multi-stage, multi-head self-attention (MHSA), and CNN modules. Combining these features generates final features that feed into the classification module. 
Extensive experiments with five culture SL datasets with high-performance accuracy compared to existing state-of-the-art models in individual domains affirming superiority and generalizability.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"5 ","pages":"144-155"},"PeriodicalIF":0.0000,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10452793","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Computer Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10452793/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Hand gesture-based Sign Language Recognition (SLR) serves as a crucial communication bridge between hard-of-hearing and hearing individuals. The absence of a universal sign language (SL) means that different nationalities use distinct cultural SLs, such as Korean, American, and Japanese sign language. Existing SLR systems perform well on their own cultural SL but may struggle with other or multi-cultural sign languages (McSL). To address these challenges, this paper introduces a novel end-to-end SLR system called GmTC, designed to translate McSL into equivalent text for enhanced understanding. We employ a graph network and a general deep-learning network as two stream modules to extract effective features. In the first stream, we produce graph-based features from superpixel values and a graph convolutional network (GCN), aiming to extract distance-based complex relationship features among the superpixels. In the second stream, we extract long-range and short-range dependency features from attention-based contextual information that passes through multi-stage multi-head self-attention (MHSA) and CNN modules. Combining the features from both streams yields the final features fed into the classification module. Extensive experiments on five cultural SL datasets show high accuracy compared to existing state-of-the-art models in the individual domains, affirming the system's superiority and generalizability.
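To make the first stream concrete, the core operation of a GCN over a superpixel graph can be sketched as a single message-passing step. This is a minimal pure-Python illustration, not the authors' implementation: the function name `gcn_step`, the mean-neighbour normalisation, and the toy adjacency are all assumptions chosen for clarity.

```python
def gcn_step(adj, feats, weight):
    """One illustrative GCN layer: aggregate neighbour features over the
    superpixel adjacency graph (with self-loops, degree-normalised), then
    apply a linear projection and ReLU.

    adj    : n x n 0/1 adjacency between superpixels
    feats  : n x f superpixel feature vectors (e.g. mean colour per region)
    weight : f x g projection matrix (learned in a real network)
    """
    n = len(adj)
    # Add self-loops so each superpixel keeps its own feature.
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    # Average features over each node's neighbourhood (including itself).
    f = len(feats[0])
    agg = [[sum(a[i][k] * feats[k][j] for k in range(n)) / deg[i]
            for j in range(f)] for i in range(n)]
    # Linear projection followed by ReLU.
    g = len(weight[0])
    return [[max(0.0, sum(agg[i][k] * weight[k][j] for k in range(f)))
             for j in range(g)] for i in range(n)]

# Toy example: three superpixels in a chain, scalar features, identity weight.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0], [2.0], [3.0]]
out = gcn_step(adj, feats, [[1.0]])  # → [[1.5], [2.0], [2.5]]
```

Each output value is simply the neighbourhood mean here; in the paper's setting the projection and repeated layers let the network learn distance-based relationships among superpixels rather than a fixed average.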