Bangla handwritten word recognition using YOLO V5

Md. Anwar Hossain, A. Abadin, Md. Omar Faruk, Iffat Ara, Mirza Afm Rashidul Hasan, Nafiul Fatta, Md Asraful, Ebrahim Hossen
Bulletin of Electrical Engineering and Informatics, published 2024-06-01
DOI: 10.11591/eei.v13i3.6953 (https://doi.org/10.11591/eei.v13i3.6953)

Abstract

This paper presents an approach to offline handwritten word recognition in Bengali, a prominent Indic language. The complexities of the script, particularly in cursive writing, often lead to overlapping characters and segmentation difficulties. Conventional methods, which recognize individual characters and then aggregate them, are error-prone. To overcome these limitations, we propose a method that treats the entire document as a single coherent entity and uses the efficient you only look once (YOLO) model for word extraction. In our approach, we view individual words as distinct objects and train the YOLO model in a supervised manner, framing object detection as a regression problem that predicts spatially separated bounding boxes and class probabilities. Training yields a box_loss of 0.014, an obj_loss of 0.14, and a class_loss of 0.009. Furthermore, the achieved mAP_0.5 score of 0.95 and mAP_0.5:0.95 score of 0.97 demonstrate the model's accuracy in detecting and recognizing handwritten words. To evaluate our method comprehensively, we introduce the Omor-Ekush dataset, a curated collection of 21,300 handwritten words from 150 participants, with 141 words per document. Our YOLO-based approach, combined with the Omor-Ekush dataset, represents a significant advance in handwritten word recognition in Bengali.
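
Treating each word as an object and scoring detections with mAP_0.5 both come down to bounding-box bookkeeping. The following is a minimal Python sketch, not the authors' code. It assumes the usual YOLOv5 label convention (class index followed by normalized centre x, centre y, width, and height), and the helper names `to_yolo_label` and `iou`, the box layout, and the example numbers are all hypothetical, chosen only for illustration.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def to_yolo_label(class_id: int, box: Box, img_w: int, img_h: int) -> str:
    """Turn a pixel-space word box into one YOLO-style label line.

    The line holds the class index and the box centre, width and height,
    each normalized by the page dimensions so the values lie in [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # A hypothetical word box on a 1000 x 2000 pixel scanned page.
    print(to_yolo_label(3, (100, 200, 300, 260), 1000, 2000))
    # Under mAP_0.5, a prediction counts as a match when IoU >= 0.5.
    truth = (100.0, 200.0, 300.0, 260.0)
    pred = (110.0, 205.0, 310.0, 265.0)
    print(iou(truth, pred) >= 0.5)
```

Under this convention each handwritten page would have one label line per word, and a predicted box is counted as correct at the 0.5 threshold only if its class matches and its IoU with the ground-truth box is at least 0.5. mAP_0.5:0.95 averages the same measurement over IoU thresholds from 0.5 to 0.95.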