IsharaNet: A robust nested feature fusion coupled with attention incorporated width scaled lightweight architecture for Bengali sign language recognition
Md Hasib Al Muzdadid Haque Himel, Md. Al Mehedi Hasan
Array, Volume 27, Article 100486. Published 2025-08-11. DOI: 10.1016/j.array.2025.100486
https://www.sciencedirect.com/science/article/pii/S2590005625001134
Abstract
Communication between hearing people and people with speech and hearing impairments, who interact mostly through sign language, remains a significant challenge. Learning and communicating in sign language is difficult for anyone who attempts it. Sign language recognition has long been studied for various languages, but the automated systems proposed so far have not proved particularly effective for Bengali, whose wide vocabulary, character set, and expressive techniques make it one of the most difficult sign languages. This paper proposes IsharaNet, a lightweight deep neural network architecture that uses parallel convolutional operations to yield a width-scaled architecture in which nested feature fusion coupled with attention is leveraged. To enhance the network’s speed, the architecture features additional dropout layers and the ReLU activation function. The proposed architecture was evaluated on the four most recently available Bengali sign language datasets: BdSL47, BdSLW-11, Shongket, and KU-BdSL. In recognizing Bengali sign numerals, the highest accuracy, F1-score, and AUC score reached 99.85%, 99.85%, and 0.999, respectively. In recognizing the Bengali sign alphabet, they reached 99.77%, 99.77%, and 0.999. For Bengali sign words, the highest accuracy was 99.09%, the F1-score 99.09%, and the AUC score 0.999. Because the proposed architecture outperformed other methods in these experiments, the findings indicate that it can be considered for a simple, automated Bengali sign language recognition system.
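For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy and a multi-class F1-score can be computed from predicted labels. It is not the authors' evaluation code. The abstract does not say how per-class F1 values are averaged, so macro averaging is an assumption here, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme the paper uses.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels for three sign classes, not data from the paper.
y_true = ["ka", "ka", "kha", "kha", "ga", "ga"]
y_pred = ["ka", "ka", "kha", "ga", "ga", "ga"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

When accuracy and F1 coincide to two decimal places, as in the reported results, errors are typically spread fairly evenly across classes; a large gap between the two would suggest that some classes are recognized much worse than others.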