{"title":"基于规则的梵文文本扫描文档歪斜校正与无关数据去除方法","authors":"P. Sharma, K. Dhingra, S. Sanyal","doi":"10.1109/SITIS.2007.93","DOIUrl":null,"url":null,"abstract":"In this paper we have presented a rule based approach for removing insignificant data and skew from scanned documents of Devanagari script. To develop an OCR system for Devanagari script is not an easy job hence proper preprocessing of these scanned documents requires noise removal and correcting skew from the image. The proposed system is based on rule based methods, morphological operations and connected component labeling. Images used for the experiment are binarised grayscale images. Experiments and results show that presented method is robust for preprocessing scanned images of Devanagari text documents.","PeriodicalId":234433,"journal":{"name":"2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"A Rule Based Approach for Skew Correction and Removal of Insignificant Data from Scanned Text Documents of Devanagari Script\",\"authors\":\"P. Sharma, K. Dhingra, S. Sanyal\",\"doi\":\"10.1109/SITIS.2007.93\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we have presented a rule based approach for removing insignificant data and skew from scanned documents of Devanagari script. To develop an OCR system for Devanagari script is not an easy job hence proper preprocessing of these scanned documents requires noise removal and correcting skew from the image. The proposed system is based on rule based methods, morphological operations and connected component labeling. Images used for the experiment are binarised grayscale images. Experiments and results show that presented method is robust for preprocessing scanned images of Devanagari text documents.\",\"PeriodicalId\":234433,\"journal\":{\"name\":\"2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SITIS.2007.93\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SITIS.2007.93","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Rule Based Approach for Skew Correction and Removal of Insignificant Data from Scanned Text Documents of Devanagari Script
In this paper we have presented a rule based approach for removing insignificant data and skew from scanned documents of Devanagari script. To develop an OCR system for Devanagari script is not an easy job hence proper preprocessing of these scanned documents requires noise removal and correcting skew from the image. The proposed system is based on rule based methods, morphological operations and connected component labeling. Images used for the experiment are binarised grayscale images. Experiments and results show that presented method is robust for preprocessing scanned images of Devanagari text documents.