Sanjoy Pratihar, Partha Bhowmick, S. Sural, J. Mukhopadhyay
2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), published 2013-12-01
DOI: 10.1109/NCVPRIPG.2013.6776179 (https://doi.org/10.1109/NCVPRIPG.2013.6776179)
Removal of hand-drawn annotation lines from document images by digital-geometric analysis and inpainting
Hand-drawn annotation lines, such as underlines, circular lines, and other text-surrounding curves, badly degrade the performance of an OCR system. A reader usually draws such lines freehand to summarize some text or to mark keywords within a document page. In this paper, we propose a generalized scheme for detecting and removing these hand-drawn annotations from a scanned document page. An underline drawn by hand is roughly horizontal or has a tolerable undulation, whereas the slope of a hand-drawn curved line usually changes at a gradual pace. Based on this observation, we detect the cover of an annotation object, be it straight or curved, as a sequence of straight edge segments. The novelty of the proposed method lies in its ability to compute the exact cover of the annotation object, even when it touches or passes through a text character. After the annotation cover is obtained, an effective inpainting method is used to quantify the regions where text reconstruction is needed. We have experimented with various documents written in English, and some results are presented here to show the efficiency and robustness of the proposed method.
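The geometric observation in the abstract (an underline stays roughly horizontal, while a hand-drawn curve changes slope gradually) can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the annotation has already been approximated by a polyline of straight segments, and the function names and the angle tolerances `max_tilt_deg` and `max_turn_deg` are hypothetical values chosen only for illustration.

```python
import math

def segment_angles(points):
    """Direction (in radians) of each straight segment of a polyline."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def turn(a, b):
    """Smallest absolute difference between two directions, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def is_underline_like(points, max_tilt_deg=10.0):
    """True if every segment is roughly horizontal.

    max_tilt_deg is a hypothetical tolerance, not a value from the paper.
    """
    limit = math.radians(max_tilt_deg)
    # Compare against both 0 and pi so the stroke direction does not matter.
    return all(min(turn(a, 0.0), turn(a, math.pi)) <= limit
               for a in segment_angles(points))

def is_smooth_curve(points, max_turn_deg=30.0):
    """True if the direction changes gradually between consecutive segments.

    max_turn_deg is a hypothetical tolerance, not a value from the paper.
    """
    limit = math.radians(max_turn_deg)
    angles = segment_angles(points)
    return all(turn(a, b) <= limit for a, b in zip(angles, angles[1:]))

# A slightly wavy underline, a circle sampled every 20 degrees, and a sharp corner.
underline = [(0, 0), (10, 1), (20, 0), (30, 1)]
circle = [(50 * math.cos(math.radians(20 * k)), 50 * math.sin(math.radians(20 * k)))
          for k in range(19)]
corner = [(0, 0), (10, 0), (10, 10)]

print(is_underline_like(underline), is_smooth_curve(underline))
print(is_underline_like(circle), is_smooth_curve(circle))
print(is_underline_like(corner), is_smooth_curve(corner))
```

Under these assumed tolerances, the wavy underline passes both checks, the sampled circle passes only the gradual-slope check, and the right-angle corner fails both, which is the kind of distinction one would expect between annotation strokes and the sharp turns of text characters.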