Utilizing a Two Planes Model to Rectify Documents With a Single Arbitrary Crease

IF 3.4 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS
Aleksandr Ershov;Daniil Tropin;Danil Kazimirov;Konstantin Bulatov;Dmitry Nikolaev
Journal: IEEE Access, vol. 12, pp. 147073-147086
Published: 2024-10-04
DOI: 10.1109/ACCESS.2024.3474099
URL: https://ieeexplore.ieee.org/document/10705295/

Abstract

The document image rectification problem is crucial in document analysis. Most current state-of-the-art methods that address it are data-driven and rely on neural networks. Although their rectifications are satisfactory, these methods are slow, which makes them unsuitable for on-device acquisition on mobile hardware. The present work concentrates on a specific but common case of physical document distortion: documents with a single crease. We investigate the properties of a surface composed of two planes captured by a pinhole camera. Namely, we provide methods to obtain the transformation between such an image and the template image once the document has been successfully localized in the frame. The approach can be used in on-device recognition systems: it takes only 3 ms to estimate the transformation parameters and about a quarter of a second to rectify an image on a smartphone CPU. We propose a novel dataset, FDI-AC, containing 200 real images of documents with a single crease in different positions. We conduct experiments comparing our approach with the current state of the art, setting a baseline performance on FDI-AC. These experiments show that the proposed algorithm outperforms the image rectification transformer network GeoTr in both rectification accuracy and time performance.
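To make the two-plane idea concrete, the sketch below treats each half of a creased page as its own plane, so that each half is related to the flat template by its own homography under a pinhole camera. This is only an illustration of that geometric fact, not the authors' estimation procedure, which the abstract does not spell out. All function names are hypothetical, and the sketch assumes that the four document corners and the two crease endpoints have already been localized in the image and that the crease runs from the top edge to the bottom edge of the template.

```python
import numpy as np


def _normalizer(pts):
    """Similarity that centers pts and scales their mean distance to sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - centroid, axis=1))
    return np.array([[scale, 0.0, -scale * centroid[0]],
                     [0.0, scale, -scale * centroid[1]],
                     [0.0, 0.0, 1.0]])


def apply_homography(h, pts):
    """Map an (n, 2) array of points through the 3x3 homography h."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]


def homography_from_points(src, dst):
    """Normalized direct linear transform: H such that dst ~ H @ src."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    t_src, t_dst = _normalizer(src), _normalizer(dst)
    src_n, dst_n = apply_homography(t_src, src), apply_homography(t_dst, dst)
    rows = []
    for (x, y), (u, v) in zip(src_n, dst_n):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.array(rows))
    h_n = vt[-1].reshape(3, 3)
    return np.linalg.inv(t_dst) @ h_n @ t_src


def two_plane_maps(width, height, crease_top_x, crease_bottom_x,
                   img_corners, img_crease):
    """Template-to-image homographies for the left and right halves.

    img_corners: image positions of the template corners TL, TR, BR, BL.
    img_crease: image positions of the crease endpoints (top, bottom).
    Assumption of this sketch: the crease joins the top and bottom edges.
    """
    tl, tr, br, bl = img_corners
    ct, cb = img_crease
    t_ct, t_cb = (crease_top_x, 0.0), (crease_bottom_x, height)
    left_quad = [(0.0, 0.0), t_ct, t_cb, (0.0, height)]
    right_quad = [t_ct, (width, 0.0), (width, height), t_cb]
    h_left = homography_from_points(left_quad, [tl, ct, cb, bl])
    h_right = homography_from_points(right_quad, [ct, tr, br, cb])
    return h_left, h_right


def template_to_image(pts, height, crease_top_x, crease_bottom_x,
                      h_left, h_right):
    """Map template points to the image with the homography of their half."""
    pts = np.asarray(pts, dtype=float)
    # x coordinate of the crease at each point's y, by linear interpolation.
    crease_x = crease_top_x + (crease_bottom_x - crease_top_x) * pts[:, 1] / height
    on_left = pts[:, 0] <= crease_x
    return np.where(on_left[:, None],
                    apply_homography(h_left, pts),
                    apply_homography(h_right, pts))
```

Rectification would then be a backward warp: for every template pixel, `template_to_image` gives the location to sample in the photograph. Note that fitting the two homographies independently, as done here, only guarantees that they agree at the two crease endpoints. In a true two-plane scene they must agree along the whole crease, and this sketch does not enforce that constraint.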
Source journal
IEEE Access
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
About the journal: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.