Deep Learning Analysis in Development of Handwritten and Plain Text Classification API

Danny Gani, James Purnama, Kho I Eng, M. Galinium, Maria Lamury
Published in: Proceedings of the 2022 International Conference on Engineering and Information Technology for Sustainable Industry
Publication date: 2022-09-21
DOI: 10.1145/3557738.3557852 (https://doi.org/10.1145/3557738.3557852)

Abstract

Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) are technologies that enable text recognition. They differ in that OCR is designed specifically for digital text, whereas HTR is designed for handwritten text. Various implementations of OCR and HTR are already available online. However, such services do not guarantee that the system runs on-premises. To solve this problem, the OCR and HTR systems must be built from scratch. The purpose of this research is to improve recognition by classifying text as handwritten or printed, so that it can then be forwarded to the appropriate recognition system. An application programming interface (API) was also created to bring the classification system into real-world use. In this research, the classification system was developed using a convolutional neural network (CNN). To reach the highest classification accuracy, experiments and improvements were conducted on hyperparameters, dataset format, and data augmentation, together with an analysis of three CNN architectures. At the end of this research, two architectures were in close competition on ideal test data: VGG-16 with 90.63% accuracy and AlexNet with 90.17% accuracy. However, AlexNet was chosen as the winner after testing on real data.
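The classify-then-forward idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the API, so every name here (`Router`, `classify`, `ocr`, `htr`, `threshold`) is hypothetical, and the stub lambdas merely stand in for a trained CNN classifier and real recognizers.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical placeholder type: raw image bytes.
Image = bytes


@dataclass
class Router:
    """Routes an image to OCR or HTR based on a binary classifier (assumed design)."""
    classify: Callable[[Image], float]  # assumed to return P(handwritten) in [0, 1]
    ocr: Callable[[Image], str]         # recognizer for printed text
    htr: Callable[[Image], str]         # recognizer for handwritten text
    threshold: float = 0.5              # assumed decision threshold

    def recognize(self, image: Image) -> Tuple[str, str]:
        """Classify the image, then forward it to the matching recognizer."""
        p_handwritten = self.classify(image)
        if p_handwritten >= self.threshold:
            return "handwritten", self.htr(image)
        return "printed", self.ocr(image)


# Stub components standing in for a trained CNN and real OCR/HTR engines.
router = Router(
    classify=lambda img: 0.9 if img.startswith(b"hw") else 0.1,
    ocr=lambda img: "ocr-result",
    htr=lambda img: "htr-result",
)

label, text = router.recognize(b"hw-sample")
print(label, text)
```

Keeping the classifier and the two recognizers as injected callables means any of the CNN architectures compared in the paper could be swapped in without changing the routing logic.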