Authors: M. Ramanan, A. Ramanan, E. Charles
Published in: 2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer), 2015-08-01
DOI: 10.1109/ICTER.2015.7377685 (https://doi.org/10.1109/ICTER.2015.7377685)
A hybrid decision tree for printed Tamil character recognition using SVMs
Optical character recognition (OCR) is an important research area in image processing and pattern recognition. OCR for printed Tamil text is considered a challenging problem because of the large number of characters (247), their complicated structures, the similarity between characters, and the variety of font styles. This paper proposes a novel approach to multiclass classification that recognises Tamil characters using binary support vector machines (SVMs) organised in a hybrid decision tree. The proposed decision tree is a binary rooted directed acyclic graph (DAG) followed by unbalanced decision trees (UDTs). The DAG implements one-versus-one (OVO) SVMs, whereas the UDTs implement one-versus-all (OVA) SVMs. Each node of the hybrid decision tree exploits an optimal feature subset in classifying the Tamil characters. The features used by the decision tree are basic, density, histogram of oriented gradients (HOG), and transition features. Experiments were carried out on a dataset of 12400 samples, and the observed recognition rate is 98.80% with the hybrid approach of DAG and UDT SVMs using an RBF kernel.
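The two-stage structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the DAG stage narrows a sample down to a coarse group of characters and that a UDT then resolves the character within that group; the abstract does not spell out this split. Nearest-centroid rules on a single number stand in for trained RBF-kernel SVMs, and the group names, character labels, and centroid values are hypothetical toy data.

```python
from itertools import combinations

# Hypothetical toy data: three coarse groups, each holding a few characters.
# A real system would use feature vectors and trained SVMs instead.
GROUP_CENTROIDS = {"g0": 0.0, "g1": 10.0, "g2": 20.0}
CHAR_CENTROIDS = {
    "g0": {"a": -1.0, "b": 1.0},
    "g1": {"c": 9.0, "d": 10.0, "e": 11.0},
    "g2": {"f": 20.0},
}


def make_pairwise(first, second, centroids):
    """Stand-in for one OVO binary SVM: returns the winning label."""
    def decide(x):
        d_first = abs(x - centroids[first])
        d_second = abs(x - centroids[second])
        return first if d_first <= d_second else second
    return decide


def make_ova(label, centroids):
    """Stand-in for one OVA binary SVM: True if x belongs to `label`."""
    def accepts(x):
        nearest = min(centroids, key=lambda c: abs(x - centroids[c]))
        return nearest == label
    return accepts


def dag_predict(x, classes, pairwise):
    """DAG evaluation by list elimination: k - 1 pairwise tests for k classes."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise[(first, last)](x) == first:
            remaining.pop()       # `last` is eliminated
        else:
            remaining.pop(0)      # `first` is eliminated
    return remaining[0]


def udt_predict(x, ordered_ova, fallback):
    """UDT evaluation: a chain of OVA tests that stops at the first acceptance."""
    for label, accepts in ordered_ova:
        if accepts(x):
            return label
    return fallback               # the last class needs no classifier


# Build the OVO classifiers for the DAG stage (one per pair of groups).
GROUPS = list(GROUP_CENTROIDS)
PAIRWISE = {
    (a, b): make_pairwise(a, b, GROUP_CENTROIDS)
    for a, b in combinations(GROUPS, 2)
}

# Build one UDT per group: OVA nodes for all but the last character.
UDTS = {}
for group, chars in CHAR_CENTROIDS.items():
    labels = list(chars)
    nodes = [(label, make_ova(label, chars)) for label in labels[:-1]]
    UDTS[group] = (nodes, labels[-1])


def hybrid_predict(x):
    """DAG picks the group, then that group's UDT picks the character."""
    group = dag_predict(x, GROUPS, PAIRWISE)
    nodes, fallback = UDTS[group]
    return udt_predict(x, nodes, fallback)


if __name__ == "__main__":
    for value in (-3.0, 9.2, 11.4, 25.0):
        print(value, "->", hybrid_predict(value))
```

With k classes, the DAG stage evaluates k - 1 pairwise classifiers per sample even though k(k - 1)/2 of them are trained, while a UDT over m classes evaluates at most m - 1 OVA classifiers. That is the usual motivation for chaining the two.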