Pre-Processing of Degraded Printed Documents by Non-local Means and Total Variation

Laurence Likforman-Sulem, J. Darbon, E. B. Smith
Published in: 2009 10th International Conference on Document Analysis and Recognition
Publication date: 2009-07-26
DOI: 10.1109/ICDAR.2009.210 (https://doi.org/10.1109/ICDAR.2009.210)
Citations: 19

Abstract

In this study, we compare two image restoration approaches for the pre-processing of printed documents: the Non-local Means filter and a total variation minimization approach. We apply both approaches to printed document sets from various periods and evaluate their effectiveness through character recognition performance using an open-source OCR engine. Our results show that, for each document set, one or both pre-processing methods improve character recognition accuracy over recognition without pre-processing. Non-local Means yields higher accuracy when characters have a low level of degradation, since they can be restored from similar neighboring parts of non-degraded characters. The Total Variation approach is more effective when characters are highly degraded and can only be restored through modeling rather than from neighboring data.
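To make the contrast between the two approaches concrete, here is a minimal one-dimensional sketch in Python, applied to a single scanline of pixel intensities. It is an illustration under stated assumptions, not the authors' implementation: the function names (`nl_means_1d`, `tv_denoise_1d`, `tv_energy`) and all parameter values are hypothetical, and the total variation part uses plain gradient descent on a smoothed TV energy, which is a stand-in because the abstract does not describe the solver actually used.

```python
import math


def nl_means_1d(signal, patch_radius=1, search_radius=5, h=0.2):
    """Non-local means on a 1-D scanline: each sample becomes a weighted
    average of nearby samples whose surrounding patches look similar."""
    n = len(signal)

    def patch(i):
        # Clamp indices at the borders so every patch has the same length.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        num = den = 0.0
        lo, hi = max(0, i - search_radius), min(n, i + search_radius + 1)
        for j in range(lo, hi):
            dist2 = sum((a - b) ** 2 for a, b in zip(p_i, patch(j)))
            w = math.exp(-dist2 / (h * h))  # similar patches get weight near 1
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because j == i contributes weight 1
    return out


def tv_energy(u, f, lam=0.2, eps=1e-2):
    """Smoothed TV energy: quadratic data fidelity plus lam times a
    differentiable approximation of sum |u[i+1] - u[i]|."""
    fidelity = 0.5 * sum((a - b) ** 2 for a, b in zip(u, f))
    tv = sum(math.sqrt((u[i + 1] - u[i]) ** 2 + eps)
             for i in range(len(u) - 1))
    return fidelity + lam * tv


def tv_denoise_1d(f, lam=0.2, eps=1e-2, step=0.1, iters=200):
    """Gradient descent on tv_energy, starting from the observed signal.
    Assumption: step is small enough for these lam/eps defaults; it should
    be reduced if lam grows or eps shrinks."""
    u = list(f)
    n = len(u)
    for _ in range(iters):
        grad = [u[i] - f[i] for i in range(n)]  # fidelity term
        for i in range(n - 1):
            d = u[i + 1] - u[i]
            g = lam * d / math.sqrt(d * d + eps)  # derivative of smoothed |d|
            grad[i] -= g
            grad[i + 1] += g
        u = [u[i] - step * grad[i] for i in range(n)]
    return u


# Toy scanline: bright background (about 1.0) crossed by two dark strokes.
scanline = [1.0, 0.9, 1.0, 0.1, 0.0, 0.2, 1.0, 0.95, 1.0, 0.05, 0.0, 0.1, 1.0]
restored_nlm = nl_means_1d(scanline)
restored_tv = tv_denoise_1d(scanline)
```

The non-local means routine only ever averages values that already occur in the neighborhood, which matches the abstract's remark that it relies on similar, non-degraded neighboring parts. The total variation routine instead fits a piecewise-smooth model to the data, so it does not depend on clean look-alike patches being available.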