A hybrid decision tree for printed Tamil character recognition using SVMs

M. Ramanan, A. Ramanan, E. Charles
Published in: 2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer), 2015-08-01
DOI: 10.1109/ICTER.2015.7377685 (https://doi.org/10.1109/ICTER.2015.7377685)
Citations: 9

Abstract

Optical character recognition (OCR) is an important research area in image processing and pattern recognition. OCR for printed Tamil text is considered a challenging problem because of the large number of characters (247), their complicated structures, the similarity between characters, and the variety of font styles. This paper proposes a novel approach to multiclass classification for recognising Tamil characters using binary support vector machines (SVMs) organised in a hybrid decision tree. The proposed decision tree is a rooted binary directed acyclic graph (DAG) followed by unbalanced decision trees (UDTs). The DAG implements one-versus-one (OVO) SVMs, whereas the UDTs implement one-versus-all (OVA) SVMs. Each node of the hybrid decision tree uses an optimal feature subset to classify the Tamil characters. The features used by the decision tree are basic, density, histogram of oriented gradients (HOG), and transition features. Experiments were carried out on a dataset of 12,400 samples, and the observed recognition rate is 98.80% with the hybrid DAG and UDT SVM approach using an RBF kernel.
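The control flow of a DAG of OVO classifiers followed by UDTs of OVA classifiers can be sketched as below. This is a minimal illustration, not the authors' implementation: the abstract does not say how the 247 characters are partitioned or how nodes are ordered, so the sketch assumes the DAG selects a character group and a UDT then resolves the character within that group. The binary classifiers are plain Python callables standing in for trained RBF-kernel SVMs, and every name here (`dag_predict`, `udt_predict`, `hybrid_predict`, `GROUPS`, the toy centres) is hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Sequence[float]
BinaryClf = Callable[[Sample], bool]  # stand-in for a trained binary SVM


def dag_predict(x: Sample, labels: List[str],
                pairwise: Dict[Tuple[str, str], BinaryClf]) -> str:
    """Evaluate a rooted binary DAG of one-versus-one classifiers.

    Each node compares the first and last remaining candidates and
    eliminates the loser, so K candidates need K - 1 evaluations.
    pairwise[(a, b)](x) returns True to keep a, False to keep b.
    """
    candidates = list(labels)
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        if pairwise[(a, b)](x):
            candidates.pop()       # b is eliminated
        else:
            candidates.pop(0)      # a is eliminated
    return candidates[0]


def udt_predict(x: Sample, ordered_labels: List[str],
                one_vs_rest: Dict[str, BinaryClf]) -> str:
    """Evaluate an unbalanced decision tree of one-versus-all classifiers.

    Each level tests one class against the rest and stops at the first
    acceptance; the last label is the default leaf.
    """
    for label in ordered_labels[:-1]:
        if one_vs_rest[label](x):
            return label
    return ordered_labels[-1]


def hybrid_predict(x: Sample, groups: Dict[str, List[str]],
                   pairwise: Dict[Tuple[str, str], BinaryClf],
                   one_vs_rest: Dict[str, BinaryClf]) -> str:
    """Assumed hybrid: the DAG picks a group, a UDT picks the class in it."""
    group = dag_predict(x, list(groups), pairwise)
    return udt_predict(x, groups[group], one_vs_rest)


# Toy 1-D data (assumption): nearest-centre rules replace real SVMs.
GROUPS = {"g0": ["a", "b"], "g1": ["c", "d"], "g2": ["e", "f"]}
GROUP_CENTRE = {"g0": 0.0, "g1": 10.0, "g2": 20.0}
CLASS_CENTRE = {"a": -1.0, "b": 1.0, "c": 9.0, "d": 11.0, "e": 19.0, "f": 21.0}


def closer_to(g: str, h: str) -> BinaryClf:
    # True when x is at least as close to group g's centre as to h's.
    return lambda x: abs(x[0] - GROUP_CENTRE[g]) <= abs(x[0] - GROUP_CENTRE[h])


def near(c: str) -> BinaryClf:
    # True when x lies within 1.0 of class c's centre.
    return lambda x: abs(x[0] - CLASS_CENTRE[c]) < 1.0


NAMES = list(GROUPS)
PAIRWISE = {(g, h): closer_to(g, h)
            for i, g in enumerate(NAMES) for h in NAMES[i + 1:]}
OVA = {c: near(c) for c in CLASS_CENTRE}

if __name__ == "__main__":
    print(hybrid_predict([9.2], GROUPS, PAIRWISE, OVA))
    print(hybrid_predict([21.0], GROUPS, PAIRWISE, OVA))
```

With real data, each callable would wrap a trained binary SVM applied to that node's own feature subset, which is where the per-node feature selection described in the abstract would plug in.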