Robust Document Image Dewarping Method Using Text-Lines and Line Segments

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Pub Date: 2017-11-01. DOI: 10.1109/ICDAR.2017.146

T. Kil, Wonkyo Seo, H. Koo, N. Cho


Citations: 28

Abstract

Conventional text-line-based document dewarping methods struggle with complex layouts and with images that contain very few text-lines. When an image has few aligned text-lines, photos, graphics, and/or tables usually take up a large portion of the input instead. Hence, for robust document dewarping, we propose to use line segments in the image in addition to the aligned text-lines. Based on the assumption and observation that many line segments are horizontally or vertically aligned in a well-rectified image, we encode this property into the cost function in addition to the text-line alignment cost. By minimizing the function, we obtain transformation parameters for camera pose, page curve, etc., which are used for document rectification. Because line segment directions contain many outliers and some text-lines may be missed, the overall algorithm is designed in an iterative manner. At each step, we remove text components and line segments that are not well aligned, and then minimize the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts.
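
The iterative scheme described in the abstract (minimize an alignment cost, drop poorly aligned elements, minimize again) can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration and is not the authors' formulation: the only transformation parameter is a single in-plane rotation `phi` (standing in for camera pose and page curve), the line-segment cost is taken to be `sin^2(2 * (theta - phi))`, which vanishes when a rotated segment is horizontal or vertical, the text-line alignment term is omitted, and a grid search replaces whatever optimizer the paper uses. The function names and the threshold value are hypothetical.

```python
import math


def alignment_cost(angles, phi):
    """Sum of sin^2(2 * (theta - phi)) over all segment angles (radians).

    Each term is zero when the segment, rotated back by phi, is exactly
    horizontal or vertical. This cost is an illustrative assumption.
    """
    return sum(math.sin(2.0 * (theta - phi)) ** 2 for theta in angles)


def estimate_rotation(angles, steps=1800):
    """Grid-search phi in [-45, 45) degrees for the minimum alignment cost."""
    best_phi, best_cost = 0.0, float("inf")
    for i in range(steps):
        phi = math.radians(-45.0 + 90.0 * i / steps)
        cost = alignment_cost(angles, phi)
        if cost < best_cost:
            best_phi, best_cost = phi, cost
    return best_phi


def iterative_fit(angles, threshold=0.1, max_iters=5):
    """Alternate between minimizing the cost and removing misaligned segments."""
    inliers = list(angles)
    phi = estimate_rotation(inliers)
    for _ in range(max_iters):
        # Keep only segments whose individual cost is below the threshold.
        kept = [t for t in inliers if math.sin(2.0 * (t - phi)) ** 2 <= threshold]
        if len(kept) == len(inliers) or not kept:
            break  # nothing removed (converged) or everything rejected
        inliers = kept
        phi = estimate_rotation(inliers)  # re-minimize with updated inliers
    return phi, inliers


# Toy input: segments near 10 and 100 degrees (a page rotated by about
# 10 degrees) plus three outlier directions.
segments_deg = [10.0, 10.5, 9.5, 100.0, 99.8, 100.2, 37.0, 55.0, 62.0]
phi, inliers = iterative_fit([math.radians(d) for d in segments_deg])
print(round(math.degrees(phi), 2), len(inliers))
```

In this toy setting the outlier directions bias the first estimate, and the re-minimization after they are removed pulls the estimate back toward the dominant horizontal/vertical directions, which is the motivation the abstract gives for the iterative design.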


