CNN-Based Optical Character Recognition for Isolated Printed Gujarati Characters and Handwritten Numerals

Journal impact factor: 1.5; JCR Q3 (Engineering, Multidisciplinary)
Sanket B. Suthar, Amit Thakkar
International Journal of Mathematical Engineering and Management Sciences, published 2022-10-01. DOI: https://doi.org/10.33889/ijmems.2022.7.5.042

Abstract

Optical character recognition (OCR) technologies have made significant progress in the field of language recognition. Gujarati is harder to recognize than other languages because of its curves, closed loops, modifiers, and joint characters, and considerable effort has been devoted to Gujarati OCR in the literature. Deep learning-based convolutional neural network (CNN) models have recently been applied to OCR for various languages, but they do not yet perform satisfactorily on Gujarati characters. This paper therefore proposes two novel CNN models for recognizing printed Gujarati characters and handwritten numerals. CNN-PGC (CNN for Printed Gujarati Character) and CNN-HGC (CNN for Handwritten Gujarati Character) are two optimally configured CNNs for printed Gujarati base characters and handwritten numerals, respectively. The proposed models are evaluated on specific performance indicators against traditional models and recent baseline methods. Experiments were carried out on a newly generated, well-segmented dataset of Gujarati base characters and numerals comprising 36 consonants, 13 vowels, and 10 handwritten numerals. Variations in the data, such as size, skew, noise, and blur, were also taken into account. Even in the presence of printing irregularities, writing irregularities, and degradations, the proposed method achieves a recognition rate of 98.08% for printed characters and 95.24% for handwritten numerals, which is better than other existing models.
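The abstract names variations such as size, skew, and noise (and, presumably, blur) but does not say how they were produced or how the recognition rate was computed. As a minimal sketch under that caveat, the pure-Python functions below show one way such variations could be simulated on a small grayscale bitmap, together with a helper for the recognition-rate metric. Every function name, the toy glyph, and all parameter values are illustrative assumptions and not the authors' pipeline.

```python
import random

def resize_nearest(img, new_h, new_w):
    """Size variation: nearest-neighbour resampling to new_h x new_w."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

def shear(img, slope):
    """Skew variation: shift each row sideways in proportion to its
    distance from the vertical centre; vacated pixels become background."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        shift = round(slope * (r - (h - 1) / 2))
        for c in range(w):
            src = c - shift
            if 0 <= src < w:
                out[r][c] = img[r][src]
    return out

def salt_and_pepper(img, prob, rng):
    """Noise variation: with probability prob, replace a pixel by 0 or 1."""
    return [[rng.choice((0.0, 1.0)) if rng.random() < prob else px
             for px in row] for row in img]

def box_blur(img):
    """Blur variation: 3x3 mean filter; border pixels average only
    over the neighbours that lie inside the image."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - 1), min(h, r + 2))
                    for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def recognition_rate(predicted, actual):
    """Percentage of samples whose predicted label equals the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical 5x5 glyph (a closed loop), used only for illustration.
glyph = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

rng = random.Random(0)  # fixed seed so the noisy variant is reproducible
variants = {
    "size": resize_nearest(glyph, 10, 10),
    "skew": shear(glyph, 0.5),
    "noise": salt_and_pepper(glyph, 0.1, rng),
    "blur": box_blur(glyph),
}
```

In a setup like this, each variant would be fed to the classifier alongside the clean glyph, and `recognition_rate` would be applied to the predicted and true labels of a held-out test set.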
Source journal: International Journal of Mathematical Engineering and Management Sciences (IJMEMS)
CiteScore: 3.80
Self-citation rate: 6.20%
Articles published: 57
Review time: 20 weeks
About the journal: IJMEMS is a peer-reviewed international journal covering both the theoretical and practical aspects of the mathematical, engineering, and management sciences. It considers original, previously unpublished research manuscripts on topics including, but not limited to, the following:
* Mathematical Sciences: applied mathematics and allied fields, operations research, mathematical statistics.
* Engineering Sciences: computer science engineering, mechanical engineering, information technology engineering, civil engineering, aeronautical engineering, industrial engineering, systems engineering, reliability engineering, production engineering.
* Management Sciences: engineering management, risk management, business models, supply chain management.