Handwritten Geez Digit Recognition Using Deep Learning

Appl. Comput. Intell. Soft Comput. | Pub Date: 2022-11-08 | DOI: 10.1155/2022/8515810

Mukerem Ali Nur, Mesfin Abebe, Rajesh Sharma Rajendran



Abstract

Amharic is the second most widely spoken Semitic language after Arabic, with more than 100 million speakers in Ethiopia and neighboring countries. Many historical documents are written in the Geez script. Digitizing historical handwritten documents and recognizing their handwritten characters is essential to preserving them. Handwritten digit recognition is one of the tasks involved in digitizing handwritten documents from different sources. Currently, there is very little research on handwritten Geez digit recognition, and no organized dataset is publicly available to researchers. Convolutional neural networks (CNNs) are well suited to pattern recognition tasks such as handwritten document recognition because they extract features from different styles of writing. This work proposes a CNN model to recognize Geez digits. Deep neural networks have recently shown exceptional performance in numerous pattern recognition and machine learning applications; here they are used to recognize handwritten Geez digits, an approach that has not previously been attempted for Ethiopic scripts. Our dataset, which contains 51,952 images of handwritten Geez digits collected from 524 individuals, is used to train and evaluate the CNN model. Compared with several machine-learning classification methods, the CNN improves performance significantly. The proposed CNN model achieves an accuracy of 96.21% and a loss of 0.2013, a higher recognition accuracy than earlier research on handwritten Geez digit recognition.
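
The abstract reports an accuracy and a loss but does not say how either is computed. As a rough illustration only, the sketch below assumes the loss is mean categorical cross-entropy over the classifier's output probabilities, which is a common choice but is not confirmed by the source. The function names, the three-class setup, and the toy predictions are all hypothetical and unrelated to the paper's data.

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = sum(
        1
        for p, y in zip(probs, labels)
        if max(range(len(p)), key=p.__getitem__) == y
    )
    return correct / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class.

    Assumption: the reported "loss" is of this form; the source does not say.
    """
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


# Toy predictions for four images over three hypothetical classes.
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],  # misclassified: true class is 1, predicted class is 0
]
labels = [0, 1, 2, 1]

acc = accuracy(probs, labels)
loss = cross_entropy(probs, labels)
print(f"accuracy = {acc:.2%}, loss = {loss:.4f}")
```

Under this assumption, accuracy depends only on which class receives the highest probability, while the loss also penalizes low confidence in the correct class, so the two figures need not move together.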
