Danny Gani, James Purnama, Kho I Eng, M. Galinium, Maria Lamury
Proceedings of the 2022 International Conference on Engineering and Information Technology for Sustainable Industry, published 2022-09-21. DOI: 10.1145/3557738.3557852 (https://doi.org/10.1145/3557738.3557852)
Deep Learning Analysis in Development of Handwritten and Plain Text Classification API
Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) are technologies that enable text recognition. They differ in their target: OCR is designed specifically for digital (printed) text, whereas HTR is designed for handwritten text. Various implementations of OCR and HTR are already available online. However, such systems are not guaranteed to run on premises. To solve this problem, the OCR and HTR systems must be built from scratch. The purpose of this research is to improve recognition by first classifying whether a text is handwritten or printed, so that it can then be forwarded to the appropriate recognition system. An application programming interface (API) was also created to bring the classification system into real-world usage. In this research, the classification system was developed using a convolutional neural network (CNN). To reach the highest classification accuracy, experiments and improvements were carried out on hyperparameters, dataset format, and data augmentation, and 3 CNN architectures were analyzed. At the end of the research, 2 architectures were in tight competition on ideal test data: VGG-16 with 90.63% accuracy and AlexNet with 90.17% accuracy. However, AlexNet was chosen as the winner after testing on real data.
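The classify-then-forward design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `route`, `RoutedResult`, the label strings, and the 0.5 threshold are all hypothetical, and the classifier and the two recognition engines are passed in as plain callables so that no particular CNN library or OCR/HTR engine is assumed.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical label names; the abstract does not specify them.
HANDWRITTEN = "handwritten"
PRINTED = "printed"


@dataclass
class RoutedResult:
    label: str   # predicted class of the input image
    engine: str  # recognition engine the image was forwarded to
    text: str    # text returned by that engine


def route(
    image: bytes,
    classify: Callable[[bytes], float],
    ocr: Callable[[bytes], str],
    htr: Callable[[bytes], str],
    threshold: float = 0.5,
) -> RoutedResult:
    """Classify an image, then forward it to the matching recognizer.

    Assumption: `classify` returns the probability that the image is
    handwritten. The 0.5 threshold is an illustrative default only.
    """
    p_handwritten = classify(image)
    if p_handwritten >= threshold:
        return RoutedResult(HANDWRITTEN, "HTR", htr(image))
    return RoutedResult(PRINTED, "OCR", ocr(image))


# Usage with stand-in callables in place of a trained CNN and real engines.
result = route(
    b"fake-image-bytes",
    classify=lambda img: 0.9,
    ocr=lambda img: "ocr output",
    htr=lambda img: "htr output",
)
print(result.engine, result.text)
```

In a deployment along the lines the abstract describes, `classify` would presumably wrap the trained CNN (AlexNet in the final system), and `route` would sit behind the API endpoint; both of those wiring details are assumptions here.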