{"title":"Kannada Handwritten Character Recognition and Classification Through OCR Using Hybrid Machine Learning Techniques","authors":"Deekshitha Gowda, V. Kanchana","doi":"10.1109/ICDSIS55133.2022.9915906","journal":{"name":"2022 IEEE International Conference on Data Science and Information System (ICDSIS)"},"publicationDate":"2022-07-29","citationCount":"1","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICDSIS55133.2022.9915906"}
Citations: 1
Abstract
In many workplaces in Karnataka, documents are handwritten in the regional language. Consequently, a computer-based framework is required to bridge the gap between machines and people. Converting these handwritten documents to a computer-editable format poses many challenges. One of them is classifying confusable characters, of which Kannada has many, and which may be recognized wrongly because of the way they are written. The scanned handwritten document was pre-processed and then segmented into lines, words, and characters using edge-based segmentation. Features were extracted using Convolutional Neural Networks, based mostly on the curviness of the characters. The segmented characters, with their extracted features, were then classified using Support Vector Machine, K-Nearest Neighbors, and Random Forest algorithms. The accuracy rates obtained on 2000 handwritten documents were 95% for Random Forest, 96% for Support Vector Machine, and 92% for K-Nearest Neighbors.
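The final classification stage can be illustrated with a minimal sketch of one of the three classifiers named in the abstract, K-Nearest Neighbors. This is not the authors' implementation. The function name `knn_predict`, the value `k=3`, the two-dimensional toy vectors (standing in for CNN-extracted features), and the labels `"ka"` and `"ga"` are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    if not train_features or len(train_features) != len(train_labels):
        raise ValueError("training features and labels must be non-empty and aligned")
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_features, train_labels)
    ]
    # Keep the k nearest neighbours (the sort is stable, so ties keep input order).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-D vectors standing in for CNN features of two character classes.
train_features = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.75),
]
train_labels = ["ka", "ka", "ka", "ga", "ga", "ga"]

prediction = knn_predict(train_features, train_labels, (0.2, 0.2), k=3)
```

In a real pipeline the feature vectors would have far more dimensions, and a library classifier would normally replace this hand-written loop. The sketch only shows how a segmented character's feature vector is mapped to a class label by a vote among its nearest neighbours.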