{"title":"Printed thai character segmentation and recognition","authors":"P. Chomphuwiset","doi":"10.1109/ISCMI.2017.8279611","DOIUrl":null,"url":null,"abstract":"This paper presents a techniques for recognizing printed Thai-characters. The work is divided into 2 folds. Character segmentation is firstly carried out. A connected component analysis technique is implemented to form a character boundary and extract character segments in images. Secondly, segmented characters are classified/recognized using a feature-based technique and a Convolution Neural Network (CNN). In the feature-based approach, a character image is divided into 9 regions. Each local region generates local features. The local features are concatenated resulting a global descriptor for classification. There are 66 classes of the characters. The data is collected from a gold standard data set, BEST data set. The data set contains Thai characters and some special characters, which are divided into 66 classes. Experiments are conducted and the result shows that the CNN provide the best results on the data set — obtaining 98% of accuracy. In addition, the segmentation and recognition is combined and produces promising results.","PeriodicalId":119111,"journal":{"name":"2017 IEEE 4th International Conference on Soft Computing & Machine Intelligence (ISCMI)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE 4th International Conference on Soft Computing & Machine Intelligence (ISCMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCMI.2017.8279611","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
This paper presents a technique for recognizing printed Thai characters. The work has two parts. First, character segmentation is carried out: connected component analysis is used to form character boundaries and extract character segments from images. Second, the segmented characters are classified using a feature-based technique and a Convolutional Neural Network (CNN). In the feature-based approach, a character image is divided into 9 regions, each of which generates local features; the local features are concatenated into a global descriptor for classification. The data are collected from a gold-standard data set, the BEST data set, which contains Thai characters and some special characters divided into 66 classes. Experiments show that the CNN gives the best results on the data set, reaching 98% accuracy. In addition, segmentation and recognition are combined, producing promising results.
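The two stages described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which connectivity is used, how the 9 regions are laid out, or which local features are computed. The code below assumes a binary image stored as nested lists (1 = ink), 8-connectivity, a 3x3 grid, and ink density as the single local feature per region. The function names `connected_components` and `zoning_descriptor` are hypothetical.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if img[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def zoning_descriptor(img, box, grid=3):
    """Concatenate per-region ink densities over a grid x grid split of the box.

    Ink density is an assumed stand-in for the paper's unspecified local features.
    """
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    features = []
    for gy in range(grid):
        y0 = top + gy * height // grid
        y1 = top + (gy + 1) * height // grid
        for gx in range(grid):
            x0 = left + gx * width // grid
            x1 = left + (gx + 1) * width // grid
            area = (y1 - y0) * (x1 - x0)
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            # Regions narrower than one pixel contribute a zero feature.
            features.append(ink / area if area else 0.0)
    return features


# Toy "page" with two blobs: a ring on the left and a vertical bar on the right.
page = [
    [1, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 1, 0, 0],
]
boxes = connected_components(page)
descriptors = [zoning_descriptor(page, box) for box in boxes]
```

Each descriptor has 9 values, one per region, and could be passed to any classifier over the 66 classes. Thai vowels and tone marks printed above or below a base consonant form separate connected components, so a real pipeline would need a further grouping step; the abstract does not describe how that is handled.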