A Rule Based Approach for Skew Correction and Removal of Insignificant Data from Scanned Text Documents of Devanagari Script
P. Sharma, K. Dhingra, S. Sanyal
2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, 2007-12-16
DOI: 10.1109/SITIS.2007.93 (https://doi.org/10.1109/SITIS.2007.93)
Citations: 6
Abstract
In this paper we present a rule-based approach for removing insignificant data and correcting skew in scanned documents of Devanagari script. Developing an OCR system for Devanagari script is difficult, so these scanned documents must be properly preprocessed, which requires removing noise and correcting skew in the image. The proposed system combines rule-based methods, morphological operations, and connected component labeling. The images used in the experiments are binarised grayscale images. The experimental results show that the presented method is robust for preprocessing scanned images of Devanagari text documents.
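The abstract names connected component labeling as one building block but does not state the rules applied to the labeled components. As a rough illustration only, the sketch below labels 8-connected foreground pixels in a binarised image and drops components smaller than a size threshold. The function names, the `min_pixels` threshold, and the size-based rule itself are assumptions made for this example, not the authors' method.

```python
from collections import deque


def label_components(img):
    """Label 8-connected foreground pixels (value 1) with a breadth-first flood fill.

    Returns (labels, sizes): a grid of component ids (0 = background)
    and a dict mapping each component id to its pixel count.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    sizes = {}
    next_label = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already part of a labeled component
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes


def remove_small_components(img, min_pixels):
    """Hypothetical rule: keep only components with at least min_pixels pixels."""
    labels, sizes = label_components(img)
    return [[1 if lab and sizes[lab] >= min_pixels else 0 for lab in row]
            for row in labels]


# Toy binarised page: one larger stroke group plus an isolated speck at (3, 6).
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = remove_small_components(page, min_pixels=3)
```

A pure size threshold would also delete small but meaningful marks such as dots and diacritics, which is presumably why a practical system would need further rules on component position or shape; the abstract does not say what those rules are.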