Touseef Saleh Bin Ahmed, Tawhidur Rahman, Shammo Biswas, Saifur Rahman Sabuj, Mohammed Belal Bhuian, Mohammad Ali Moni, Md Ashraful Alam
Knowledge-Based Systems, vol. 330, Article 114546. Published 2025-09-27. DOI: 10.1016/j.knosys.2025.114546. Impact factor: 7.6; JCR Q1 (Computer Science, Artificial Intelligence). Source: https://www.sciencedirect.com/science/article/pii/S0950705125015850
A vision transformer-based hybrid neural architecture for automated handwritten Bangla character recognition and braille conversion
The rapid advancement of technology has led to notable changes in the current educational system. Nevertheless, relatively few assistive aids exist for teaching individuals with disabilities, such as those who are blind or visually impaired. Braille is an effective teaching medium for these learners. Although braille has been digitized, the existing electronic versions do not handle handwritten characters. Studies on English character recognition report high accuracy, which is not yet the case for Bangla character recognition. We present an automated system that converts handwritten Bangla characters to braille using novel hybrid deep neural network architectures. Our approach begins with a Character Quality Assessment Framework (CQAF), which employs adaptive thresholds and comprehensive quality metrics designed explicitly for the characteristics of Bangla script. Building upon this foundation, we present two architectures. HybridNet-L is our initial multi-stream design, while HybridNet-S is a redesigned lightweight variant that reduces the parameter count and achieves superior accuracy, making it the primary contribution of this work. To complete the system, we implement a comprehensive accessibility solution featuring a real-time braille hardware interface and text-to-speech capabilities. The model handles all 84 Bangla character classes, including vowels, consonants, numerals, and compound characters. Extensive evaluation against seven baseline models demonstrates that HybridNet-S achieves superior performance, with 95.80% validation accuracy, while maintaining computational efficiency suitable for embedded deployment. Statistical validation and ablation studies confirm the robustness and effectiveness of our multi-stream architecture for practical assistive technology applications.
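The abstract does not describe how a recognized character is turned into a braille cell, so the following is only a minimal sketch of one plausible final step, not the authors' implementation. It assumes the classifier emits a Bangla character label and that a lookup table maps each label to its raised dot numbers. The names `BRAILLE_DOTS`, `dots_to_cell`, and `to_braille` are hypothetical, and the two table entries are illustrative placeholders that would need checking against a Bangla braille standard. The only fixed fact used is the Unicode encoding of braille patterns, where dot n sets bit n-1 above U+2800.

```python
# Hypothetical lookup: recognized class label -> raised dot numbers.
# These two entries are illustrative placeholders, not taken from the paper;
# a real table would cover all character classes from a braille standard.
BRAILLE_DOTS = {
    "অ": (1,),
    "ক": (1, 3),
}


def dots_to_cell(dots):
    """Return the Unicode braille character for the given dot numbers (1-8).

    In the Unicode Braille Patterns block, dot n corresponds to bit n-1
    of the offset from U+2800.
    """
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)
    return chr(0x2800 + mask)


def to_braille(labels, unknown="?"):
    """Convert a sequence of recognized labels to a string of braille cells.

    Labels missing from the table are replaced by `unknown` rather than
    guessed, so gaps in the table stay visible.
    """
    cells = []
    for label in labels:
        if label in BRAILLE_DOTS:
            cells.append(dots_to_cell(BRAILLE_DOTS[label]))
        else:
            cells.append(unknown)
    return "".join(cells)
```

The same dot tuples could in principle drive the pins of a refreshable braille display, one pin per dot number, though the hardware interface itself is not specified in the abstract.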
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.