Document image analysis and recognition: a survey

Impact factor: 1.2, JCR quartile: Q4 (Optics)

Computer Optics Pub Date : 2022-08-01 DOI:10.18287/2412-6179-co-1020

V. Arlazarov, E.I. Andreeva, K. Bulatov, D. Nikolaev, O. Petrova, B. I. Savelev, O. Slavin


Citations: 4

Abstract

This paper analyzes the problems of document image recognition and the existing solutions. Document recognition algorithms have been studied for a long time, yet the topic remains relevant and research continues, as evidenced by the large number of related publications and reviews. However, most of these works and reviews are devoted to individual recognition tasks. This review considers the entire set of methods, approaches, and algorithms necessary for document recognition. A preliminary systematization allowed us to distinguish groups of methods for extracting information from documents of different types: single-page and multi-page, with text and handwritten content, with a fixed template or a flexible structure, and digitized in different ways: by scanning, photographing, or video recording. We consider methods of document recognition and analysis applied to a wide range of tasks: identification and verification of identity, due diligence, machine learning algorithms, questionnaires, and audits. The groups of methods necessary for recognizing a single page image are examined: classical computer vision algorithms (keypoints, local feature descriptors, Fast Hough Transforms, image binarization) and modern neural network models for document boundary detection, document classification, document structure analysis (localization of text blocks and tables), extraction and recognition of the details, and post-processing of recognition results. The review describes publicly available experimental data packages for training and testing recognition algorithms. Methods for optimizing the performance of document image analysis and recognition are also described.




Source journal

Computer Optics (category: OPTICS)

CiteScore

4.20

Self-citation rate

10.00%

Articles published

Review time

9 weeks

Journal description: Computer Optics is intended for researchers and specialists active in the following research areas: Diffractive Optics; Information Optical Technology; Nanophotonics and Optics of Nanostructures; Image Analysis & Understanding; Information Coding & Security; Earth Remote Sensing Technologies; Hyperspectral Data Analysis; Numerical Methods for Optics and Image Processing; Intelligent Video Analysis. The journal has been published since 1987 and appears in 6 issues per year.