Title: Kannada Handwritten Script Recognition using Machine Learning Techniques
Authors: Roshan Fernandes, Anisha P. Rodrigues
Published in: 2019 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER)
Publication date: 2019-08-01
DOI: 10.1109/DISCOVER47552.2019.9008097
URL: https://doi.org/10.1109/DISCOVER47552.2019.9008097
Citation count: 10
Abstract
Many researchers have contributed to automating optical character recognition, but handwritten character recognition remains an unsolved problem. In this paper we propose two techniques for recognizing handwritten Kannada script, which yield high accuracy compared with previous work. Recognizing handwritten Kannada script poses many challenges: each person has their own handwriting, and there is no uniform spacing between letters, words, and lines. Another major problem for Kannada is that no large dataset is available to train a recognition system, and it is difficult to write out all combinations of each letter in Kannada script. In this work, we gathered a handwritten training set from the Web and from students on our campus and segmented each letter. We propose two methods for recognizing handwritten Kannada characters: the first uses the Tesseract tool, and the second uses a Convolutional Neural Network (CNN). With Tesseract we achieved 86% accuracy, and with the CNN we achieved 87% accuracy, although this might improve with a different choice of dataset and further enhanced image processing. The main idea behind this work is to extract text from scanned images, accurately identify the Kannada letters in it, and display or store the result for further use.
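The abstract states that each letter was segmented but does not say how. As an illustration only, the sketch below uses a vertical projection profile, a common baseline that splits a binarized text line wherever columns contain no ink. This is an assumed technique, not the authors' method, and the names `column_profile`, `segment_columns`, and `min_gap` are hypothetical.

```python
from typing import List, Tuple

# Binarized text-line image: 1 = ink, 0 = background (assumed representation).
Image = List[List[int]]


def column_profile(image: Image) -> List[int]:
    """Count ink pixels in each column (vertical projection profile)."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def segment_columns(image: Image, min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive.

    A segment is closed once `min_gap` consecutive empty columns follow it.
    Shorter gaps are treated as part of the same segment.
    """
    profile = column_profile(image)
    segments: List[Tuple[int, int]] = []
    start = None  # first column of the segment currently being built
    gap = 0       # number of consecutive empty columns seen so far
    for x, ink in enumerate(profile):
        if ink > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The segment ends just before the run of empty columns.
                segments.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close a segment that runs to the right edge, minus trailing gap.
        segments.append((start, len(profile) - gap))
    return segments


if __name__ == "__main__":
    line = [
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 0, 0],
    ]
    print(segment_columns(line, min_gap=1))
```

Because handwriting has no uniform spacing, a single fixed `min_gap` is unlikely to suit every writer. A larger value keeps disconnected strokes of one letter together, at the cost of merging closely written neighbors.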