IsharaNet: A robust nested feature fusion coupled with attention incorporated width scaled lightweight architecture for Bengali sign language recognition

IF 4.5 Q2 COMPUTER SCIENCE, THEORY & METHODS
Array Pub Date : 2025-08-11 DOI:10.1016/j.array.2025.100486
Md Hasib Al Muzdadid Haque Himel, Md. Al Mehedi Hasan
Array, volume 27, Article 100486. Publication type: Journal Article. https://www.sciencedirect.com/science/article/pii/S2590005625001134
Citations: 0

Abstract

Communication between hearing people and people with speech or hearing impairments, who interact mostly through sign language, remains a significant challenge. Learning and communicating in sign language is difficult for anyone who attempts it. Sign language recognition has long been studied for various languages, but the automated systems proposed so far have not proved particularly effective for Bengali, whose wide vocabulary, character set, and expressive techniques make it one of the most difficult sign languages. This paper proposes IsharaNet, a lightweight deep neural network architecture that uses parallel convolutional operations to yield a width-scaled architecture in which nested feature fusion coupled with attention is leveraged. To enhance the network's speed, the architecture features additional dropout layers and the ReLU activation function. The architecture was evaluated on the four most recently available Bengali sign language datasets: BdSL47, BdSLW-11, Shongket, and KU-BdSL. In recognizing Bengali sign numerals, the highest Accuracy, F1-score, and AUC score reached 99.85%, 99.85%, and 0.999, respectively. In recognizing the Bengali sign alphabet, they reached 99.77%, 99.77%, and 0.999. For Bengali sign words, the highest Accuracy was 99.09%, the F1-score 99.09%, and the AUC score 0.999. Because the architecture outperformed other methods in these experiments, the findings indicate that it can be considered for a simple, automated Bengali sign language recognition system.
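The abstract reports Accuracy, F1-score, and AUC but does not state how the multi-class scores are averaged. As a minimal sketch, assuming macro averaging over classes and a one-vs-rest AUC (both are assumptions, not confirmed by the source), the three metrics could be computed as below. The function names and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def binary_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def macro_ovr_auc(y_true, score_rows, labels):
    """One-vs-rest AUC averaged over classes; score_rows[i][k] scores class labels[k]."""
    aucs = []
    for k, c in enumerate(labels):
        pos = [row[k] for t, row in zip(y_true, score_rows) if t == c]
        neg = [row[k] for t, row in zip(y_true, score_rows) if t != c]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / len(aucs)


# Hypothetical toy labels for three sign classes (not real dataset output).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]
print(accuracy(toy_true, toy_pred), macro_f1(toy_true, toy_pred))
```

Macro averaging weights every sign class equally, which matters when some classes have fewer samples. A weighted average would instead track overall accuracy more closely.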
Source journal: Array (Computer Science-General Computer Science)
CiteScore: 4.40
Self-citation rate: 0.00%
Articles published: 93
Review time: 45 days