Ranjana Script Handwritten Character Recognition using CNN

Q3 Decision Sciences
Jen Bati, Pankaj Raj Dawadi
Published 2023-09-10 in JOIV International Journal on Informatics Visualization. DOI: 10.30630/joiv.7.3.1725 (https://doi.org/10.30630/joiv.7.3.1725)
Citations: 0

Abstract

This paper presents the Ranjana script Handwritten Character Dataset (RHCD), a public image database available to Ranjana script researchers and anyone interested in the subject. To the best of our knowledge, RHCD is the first publicly available database for Ranjana script research. The Ranjana script, descended from the Brahmi script, consists of 36 consonant letters, 16 vowel letters, and 10 numerals. The research has three aims: first, to create a new database for Ranjana script handwritten character recognition; second, to test character recognition accuracy on RHCD using existing CNN architectures such as LeNet-5, AlexNet, and ZFNet; third, to propose a model, arrived at by investigating different hyperparameter settings, that improves recognition accuracy on RHCD. The method consists of dataset collection, digitization and cropping, pre-processing, dataset splitting, data augmentation, and finally implementation of the CNN models (existing and proposed). Performance is evaluated by test accuracy, precision, recall, and F1-score. The experimental results show that our model ranks first, with a test accuracy of 99.73% at 64x64-pixel resolution and precision, recall, and F1-score values of 1. The creation and recognition of Ranjana script characters, vowel modifiers, and compound characters can be the next milestone. Segmenting words and sentences into characters and recognizing each character individually can be the next research domain.
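The abstract names test accuracy, precision, recall, and F1-score as the evaluation measures but does not say how the per-class values are combined. As a minimal sketch, assuming macro averaging (an unweighted mean over classes) and one class per letter or numeral (36 + 16 + 10 = 62 classes), the measures could be computed as follows. The function name `macro_scores` and the example labels are hypothetical and do not come from the paper.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption here; the abstract does not state
    which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Precision: of everything predicted as this class, how much was right.
        p = correct[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: of everything truly in this class, how much was found.
        r = correct[label] / true_count[label] if true_count[label] else 0.0
        # F1: harmonic mean of precision and recall (0 when both are 0).
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(correct.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical class labels; a real run would use the 62 RHCD classes.
    y_true = ["ka", "ka", "kha", "a", "a", "1"]
    y_pred = ["ka", "kha", "kha", "a", "a", "1"]
    print(macro_scores(y_true, y_pred))
```

With a different averaging scheme (micro or support-weighted), the precision, recall, and F1 values can differ from the macro figures, so a reported value is only comparable once the scheme is known.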
Source journal: JOIV International Journal on Informatics Visualization (Decision Sciences: Information Systems and Management)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 100
Review time: 16 weeks