Binarization and Segmentation of Kannada Handwritten Document Images
Vinod H.C, S. Niranjan
2018 Second International Conference on Green Computing and Internet of Things (ICGCIoT), 2018-08-01
DOI: 10.1109/ICGCIOT.2018.8753039
Citations: 0
Abstract
Binarization of document images is a major phase in the handwritten text recognition process. Text recognition gives its best results, and is easiest to achieve, for printed documents, but more accurate and faster binarization and segmentation methods are required to achieve high accuracy in handwritten character recognition. In this paper we present two modules: document binarization and segmentation. Document binarization is carried out using Haar wavelet decomposition, a Laplacian mask, maximum gradient difference, a median filter, and morphological operators. Segmentation is done by applying the projection profile method and paragraph skew correction recursively until the height of each segmented line image is less than 7% of the input image; Connected Component Analysis is then used to segment words. These segmented words can be fed to OCR for recognition; the experimental results are encouraging.
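
The recursive line segmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an already binarized image stored as a list of rows with 1 for ink and 0 for background, it reads the 7% criterion as relative to the page height, it omits the skew-correction step entirely, and the rule for re-splitting an overly tall band (raising the ink-count threshold by one) is an assumption made for illustration. The names `row_profile`, `split_bands`, and `segment_lines` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink (foreground), 0 = background


def row_profile(img: Image, top: int, bottom: int) -> List[int]:
    """Horizontal projection profile: number of ink pixels in each row of [top, bottom)."""
    return [sum(img[r]) for r in range(top, bottom)]


def split_bands(profile: List[int], offset: int, threshold: int) -> List[Tuple[int, int]]:
    """Return (top, bottom) runs of consecutive rows whose ink count exceeds threshold."""
    bands: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(profile):
        if count > threshold and start is None:
            start = i                      # a band of text rows begins
        elif count <= threshold and start is not None:
            bands.append((offset + start, offset + i))
            start = None                   # the band ended at a gap row
    if start is not None:                  # band runs to the end of the region
        bands.append((offset + start, offset + len(profile)))
    return bands


def segment_lines(img: Image, max_ratio: float = 0.07) -> List[Tuple[int, int]]:
    """Split a page into line bands, re-splitting any band taller than
    max_ratio of the page height (skew correction is NOT performed here)."""
    limit = max_ratio * len(img)
    lines: List[Tuple[int, int]] = []

    def recurse(top: int, bottom: int, threshold: int) -> None:
        profile = row_profile(img, top, bottom)
        for t, b in split_bands(profile, top, threshold):
            peak = max(profile[t - top:b - top])
            # Accept the band if it is short enough, or if raising the
            # threshold further would erase it instead of splitting it.
            if (b - t) < limit or peak <= threshold + 1:
                lines.append((t, b))
            else:
                # Assumed re-split rule: demand more ink per row so that
                # thin bridges between touching lines become gaps.
                recurse(t, b, threshold + 1)

    recurse(0, len(img), 0)
    return lines
```

Calling `segment_lines(img)` returns the row ranges of the detected text lines from top to bottom. Each range could then be cropped and passed to a connected-component step to obtain words, as the abstract describes.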