A hybrid decision tree for printed Tamil character recognition using SVMs
M. Ramanan, A. Ramanan, E. Charles
2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer), published 2015-08-01
DOI: https://doi.org/10.1109/ICTER.2015.7377685
Citations: 9
Abstract
Optical character recognition (OCR) is an important research area in image processing and pattern recognition. OCR for printed Tamil text is considered a challenging problem because of the large number of characters (247), their complicated structures, the similarity between characters, and the variety of font styles. This paper proposes a novel multiclass classification approach that recognises Tamil characters using binary support vector machines (SVMs) organised in a hybrid decision tree. The proposed decision tree is a binary rooted directed acyclic graph (DAG) that is succeeded by unbalanced decision trees (UDTs). The DAG implements one-versus-one (OVO) SVMs, whereas the UDTs implement one-versus-all (OVA) SVMs. Each node of the hybrid decision tree exploits an optimal feature subset in classifying the Tamil characters. The features used by the decision tree are basic, density, histogram of oriented gradients (HOG), and transition features. Experiments were carried out on a dataset of 12,400 samples, and the observed recognition rate is 98.80% with the hybrid approach of DAG and UDT SVMs using the RBF kernel.
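The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the DAG hands over to the UDTs, so the sketch assumes that the DAG selects a coarse group of characters and a UDT then resolves the character within that group. Simple distance tests on a single toy feature stand in for trained RBF-kernel SVMs, per-node feature subset selection is not modelled, and every name (`dag_predict`, `udt_predict`, `hybrid_predict`, the group and character labels) is hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
# Stand-in for a binary SVM: returns True or False for one sample.
Decider = Callable[[Features], bool]


def dag_predict(classes: List[str],
                pairwise: Dict[Tuple[str, str], Decider],
                x: Features) -> str:
    """Walk a rooted binary DAG of one-versus-one (OVO) deciders.

    Each node compares the first and last remaining candidates and
    eliminates the loser, so N classes need N - 1 evaluations.
    pairwise[(a, b)] returns True when the sample looks more like a than b.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if pairwise[(a, b)](x):
            remaining.pop()      # b is eliminated
        else:
            remaining.pop(0)     # a is eliminated
    return remaining[0]


def udt_predict(ordered: List[Tuple[str, Decider]],
                fallback: str,
                x: Features) -> str:
    """Walk an unbalanced tree of one-versus-all (OVA) deciders.

    Each node either accepts its own class or passes the sample down;
    the last class needs no decider and acts as the final leaf.
    """
    for label, is_label in ordered:
        if is_label(x):
            return label
    return fallback


# Toy data (assumption): one feature, three groups of three characters each.
GROUP_CENTRES = {"G0": 0.0, "G1": 10.0, "G2": 20.0}
CHAR_CENTRES = {
    "G0": {"a": -1.0, "aa": 0.0, "i": 1.0},
    "G1": {"ka": 9.0, "ca": 10.0, "ta": 11.0},
    "G2": {"pa": 19.0, "ma": 20.0, "ya": 21.0},
}


def closer_to(ca: float, cb: float) -> Decider:
    # Hypothetical OVO stand-in: nearest-centre test between two groups.
    return lambda x: abs(x[0] - ca) <= abs(x[0] - cb)


def near(centre: float, radius: float = 0.5) -> Decider:
    # Hypothetical OVA stand-in: accept samples close to one centre.
    return lambda x: abs(x[0] - centre) < radius


GROUPS = list(GROUP_CENTRES)
PAIRWISE = {
    (a, b): closer_to(GROUP_CENTRES[a], GROUP_CENTRES[b])
    for i, a in enumerate(GROUPS)
    for b in GROUPS[i + 1:]
}


def hybrid_predict(x: Features) -> str:
    """Stage 1: the DAG picks a group. Stage 2: that group's UDT picks a character."""
    group = dag_predict(GROUPS, PAIRWISE, x)
    chars = list(CHAR_CENTRES[group].items())
    ordered = [(label, near(centre)) for label, centre in chars[:-1]]
    return udt_predict(ordered, chars[-1][0], x)


if __name__ == "__main__":
    for sample in ([0.1], [10.9], [19.2]):
        print(sample, "->", hybrid_predict(sample))
```

In a real system each `Decider` would wrap a trained binary SVM, and the order of classes in each UDT would matter, since classes tested near the root need fewer evaluations.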