Handwritten Geez Digit Recognition Using Deep Learning

Mukerem Ali Nur, Mesfin Abebe, Rajesh Sharma Rajendran
Appl. Comput. Intell. Soft Comput., pages 8515810:1-8515810:12. Published 2022-11-08. DOI: 10.1155/2022/8515810 (https://doi.org/10.1155/2022/8515810)
Citations: 2

Abstract

Amharic is the second most widely spoken Semitic language after Arabic, with more than 100 million speakers in Ethiopia and neighboring countries. Many historical documents are written in the Geez script. Digitizing historical handwritten documents and recognizing handwritten characters are essential to preserving these valuable documents, and handwritten digit recognition is one of the tasks involved in digitizing handwritten documents from different sources. Currently, there are very few studies on handwritten Geez digit recognition, and no organized dataset is publicly available to researchers. Convolutional neural networks (CNNs) are well suited to pattern recognition tasks such as handwritten document recognition because they extract features from different writing styles. In this work, we propose a CNN model to recognize Geez digits. Deep neural networks, which have recently shown exceptional performance in numerous pattern recognition and machine learning applications, are used to recognize handwritten Geez digits, an approach that has not previously been attempted for Ethiopic scripts. Our dataset, which contains 51,952 images of handwritten Geez digits collected from 524 individuals, is used to train and evaluate the CNN model. The CNN significantly improves on the performance of several machine-learning classification methods. Our proposed CNN model achieves an accuracy of 96.21% and a loss of 0.2013. Compared with earlier research on handwritten Geez digit recognition, this study attains higher recognition accuracy with the developed CNN model.
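The abstract reports two evaluation figures, an accuracy and a loss, without stating how the loss is defined. The following is a minimal sketch of how such figures are commonly computed for a classifier, assuming the loss is the mean categorical cross-entropy over softmax outputs (an assumption, not something the abstract confirms). The probabilities, labels, and function names are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted == y:
            correct += 1
    return correct / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class.

    Assumption: this is the loss meant in the abstract; eps guards against log(0).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)


# Hypothetical softmax outputs for four samples over three classes.
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.30, 0.50],
    [0.60, 0.30, 0.10],  # misclassified: the true class is 1
]
labels = [0, 1, 2, 1]

acc = accuracy(probs, labels)
loss = cross_entropy(probs, labels)
print(f"accuracy = {acc:.2%}, loss = {loss:.4f}")
```

Under this definition the two numbers measure different things: accuracy counts only whether the top prediction is right, while the loss also penalizes correct predictions made with low confidence, so a model can score high accuracy and still have a non-trivial loss.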