DAR '12 | Pub Date: 2012-12-16 | DOI: 10.1145/2432553.2432571
Munish Kumar, R. Sharma, M. Jindal
{"title":"Offline handwritten Gurmukhi character recognition: study of different feature-classifier combinations","authors":"Munish Kumar, R. Sharma, M. Jindal","doi":"10.1145/2432553.2432571","DOIUrl":"https://doi.org/10.1145/2432553.2432571","url":null,"abstract":"Offline handwritten character recognition (OHCR) is the method of converting handwritten text into machine processable layout. Since late sixties, efforts have been made for offline handwritten character recognition throughout the world. Principal Component Analysis (PCA) has also been used for extracting representative features for character recognition. In order to assess the prominence of features in offline handwritten Gurmukhi character recognition, we have recognized offline handwritten Gurmukhi characters with different combinations of features and classifiers. The recognition system first sets up a skeleton of the character so that significant feature information about the character can be extracted. For the purpose of classification, we have used k-NN, Linear-SVM, Polynomial-SVM and RBF-SVM based approaches. In present work, we have collected 7,000 samples of isolated offline handwritten Gurmukhi characters from 200 different writers. The set of basic 35 akhars of Gurmukhi has been considered here. A partitioning policy for selecting the training and testing patterns has also been experimented in present work. We have used zoning feature; diagonal feature; directional feature; intersection and open end points feature; transition feature; parabola curve fitting based feature and power curve fitting based feature extraction technique in order to find the feature set for a given character. The proposed system achieves a recognition accuracy of 94.8% when PCA is not applied and a recognition accuracy of 97.7% when PCA is applied.","PeriodicalId":410986,"journal":{"name":"DAR '12","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127608031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DAR '12 | Pub Date: 2012-12-16 | DOI: 10.1145/2432553.2432562
S. Sundaram, Bhargava Urala K, A. Ramakrishnan
{"title":"Language models for online handwritten Tamil word recognition","authors":"S. Sundaram, Bhargava Urala K, A. Ramakrishnan","doi":"10.1145/2432553.2432562","DOIUrl":"https://doi.org/10.1145/2432553.2432562","url":null,"abstract":"N-gram language models and lexicon-based word-recognition are popular methods in the literature to improve recognition accuracies of online and offline handwritten data. However, there are very few works that deal with application of these techniques on online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word recognition (8%) accuracies and while lexicon methods offer much greater improvements (30%) in terms of word recognition, there is a large dependency on choosing the right lexicon. For comparison to lexicon and language model based methods, we have also explored re-evaluation techniques which involve the use of expert classifiers to improve symbol and word recognition accuracies.","PeriodicalId":410986,"journal":{"name":"DAR '12","volume":"30 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132982763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DAR '12 | Pub Date: 2012-12-16 | DOI: 10.1145/2432553.2432564
Sangeet Aggarwal, Sanjeev Kumar, Ritu Garg, S. Chaudhury
{"title":"Content directed enhancement of degraded document images","authors":"Sangeet Aggarwal, Sanjeev Kumar, Ritu Garg, S. Chaudhury","doi":"10.1145/2432553.2432564","DOIUrl":"https://doi.org/10.1145/2432553.2432564","url":null,"abstract":"Most of the document pre-processing techniques are parameter dependent. In this paper, we present a novel framework that learns optimal parameters, depending on the nature of the document image content for binarization and text/graphics segmentation. The learning problem has been formulated as an optimization problem using EM algorithm to adaptively learn optimal parameters. Experimental results have established the effectiveness of our approach.","PeriodicalId":410986,"journal":{"name":"DAR '12","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123268014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}