2019 International Conference on Document Analysis and Recognition (ICDAR) — Latest Publications

A Synthetic Recipe for OCR
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00143
Authors: David Etter, Stephen Rawls, Cameron Carpenter, Gregory Sell
Abstract: Synthetic data generation for optical character recognition (OCR) promises unlimited training data at zero annotation cost. With enough fonts and seed text, we should be able to generate data to train a model that approaches or exceeds the performance of one trained on real annotated data. Unfortunately, this is not always the reality. Unconstrained image settings, such as internet memes, scanned web pages, or newspapers, present diverse scripts, fonts, layouts, and complex backgrounds, which cause models trained on synthetic data to break down. In this work, we investigate the synthetic image generation problem on a large multilingual set of unconstrained document images. Our work presents a comprehensive evaluation of the impact of synthetic data attributes on model performance. The results provide a recipe for synthetic data generation that will help guide future research.
Citations: 16
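No code accompanies this listing; the sketch below only illustrates the kind of font-and-seed-text rendering pipeline such work evaluates, using Pillow. The font path and seed words are placeholders, not the authors' setup.

```python
# Minimal synthetic word-image generator of the kind studied in the paper.
# Font path and seed words are illustrative placeholders.
from PIL import Image, ImageDraw, ImageFont
import random

def render_word(word, font_path, size=32):
    font = ImageFont.truetype(font_path, size)
    # Measure the rendered word, then add a small random margin.
    left, top, right, bottom = font.getbbox(word)
    w, h = right - left, bottom - top
    pad = random.randint(2, 10)
    img = Image.new("L", (w + 2 * pad, h + 2 * pad), color=255)
    draw = ImageDraw.Draw(img)
    draw.text((pad - left, pad - top), word, font=font, fill=0)
    return img

samples = [render_word(w, "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf")
           for w in ["recipe", "synthetic", "OCR"]]
```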
ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00252
Authors: Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, Chuanming Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
Abstract: This paper reports on the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT, which consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top-performing score of each task is as follows: i) T1 - 82.65%, ii) T2.1 - 74.3%, iii) T2.2 - 85.32%, iv) T3.1 - 53.86%, and v) T3.2 - 54.91%. Apart from the results, this paper also details the ArT dataset, task descriptions, evaluation metrics, and participants' methods. The dataset, the evaluation kit, and the results are publicly available at the challenge website.
Citations: 124
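The challenge's actual evaluation kit is not reproduced here; as a rough sketch of the polygon-level IoU matching that detection scoring for arbitrary-shaped text rests on, assuming the shapely geometry library and an illustrative 0.5 threshold:

```python
# Sketch of IoU-based matching for polygon (arbitrary-shaped) text detections.
# Greedy one-to-one matching and the 0.5 threshold are illustrative choices.
from shapely.geometry import Polygon

def polygon_iou(pts_a, pts_b):
    a, b = Polygon(pts_a), Polygon(pts_b)
    if not a.is_valid or not b.is_valid:
        return 0.0
    inter = a.intersection(b).area
    union = a.union(b).area
    return inter / union if union > 0 else 0.0

def match_detections(gts, dets, thresh=0.5):
    # Each ground-truth polygon matches at most one detection.
    matched, used = 0, set()
    for gt in gts:
        for i, det in enumerate(dets):
            if i not in used and polygon_iou(gt, det) >= thresh:
                matched, used = matched + 1, used | {i}
                break
    precision = matched / len(dets) if dets else 0.0
    recall = matched / len(gts) if gts else 0.0
    return precision, recall
```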
A Comparative Study of Attention-Based Encoder-Decoder Approaches to Natural Scene Text Recognition
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00151
Authors: Fu'ze Cong, Wenping Hu, Qiang Huo, Li Guo
Abstract: Attention-based encoder-decoder approaches have shown promising results in scene text recognition. In the literature, models with different encoders, decoders, and attention mechanisms have been proposed and compared on isolated word recognition tasks, where the models are trained on either synthetic word images or a small set of real-world images. In this paper, we investigate different components of the attention-based framework and compare its performance with a CNN-DBLSTM-CTC based approach on large-scale real-world scene text sentence recognition tasks. We train character models on more than 1.6M real-world text lines and compare their performance on test sets collected from a variety of real-world scenarios. Our results show that (1) attention on a two-dimensional feature map yields better performance than on a one-dimensional one, and an RNN-based decoder performs better than a CNN-based one; (2) attention-based approaches can achieve higher recognition accuracy than CNN-DBLSTM-CTC based approaches on isolated word recognition tasks, but perform worse on sentence recognition tasks; (3) it is more effective and efficient for CNN-DBLSTM-CTC based approaches to leverage an explicit language model to boost recognition accuracy.
Citations: 11
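As a minimal sketch of one component the study compares, here is additive attention over a two-dimensional feature map in PyTorch; the feature, hidden, and attention dimensions are illustrative, not the authors' configuration.

```python
# Additive attention over a 2D conv feature map: the decoder state scores every
# spatial position and pools a context ("glimpse") vector. Sizes are illustrative.
import torch
import torch.nn as nn

class Attention2D(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=256, attn_dim=128):
        super().__init__()
        self.proj_feat = nn.Linear(feat_dim, attn_dim)
        self.proj_state = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, state):
        # feats: (B, H, W, feat_dim) feature map; state: (B, hidden_dim) decoder state.
        b, h, w, d = feats.shape
        flat = feats.view(b, h * w, d)                      # flatten the spatial grid
        e = self.score(torch.tanh(self.proj_feat(flat) +
                                  self.proj_state(state).unsqueeze(1)))
        alpha = torch.softmax(e, dim=1)                     # (B, H*W, 1) attention weights
        context = (alpha * flat).sum(dim=1)                 # (B, feat_dim) glimpse
        return context, alpha.view(b, h, w)

ctx, attn_map = Attention2D()(torch.randn(2, 8, 32, 512), torch.randn(2, 256))
```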
Deep Splitting and Merging for Table Structure Decomposition
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00027
Authors: Chris Tensmeyer, Vlad I. Morariu, Brian L. Price, Scott D. Cohen, Tony R. Martinez
Abstract: Given the large variety and complexity of tables, table structure extraction is a challenging task in automated document analysis systems. We present a pair of novel deep learning models (the Split and Merge models) that, given an input image, 1) predict the basic table grid pattern and 2) predict which grid elements should be merged to recover cells that span multiple rows or columns. We propose projection pooling as a novel component of the Split model and grid pooling as a novel part of the Merge model. While most fully convolutional networks rely on local evidence, these unique pooling regions allow our models to take advantage of the global table structure. We achieve state-of-the-art performance on the public ICDAR 2013 Table Competition dataset of PDF documents. On a much larger private dataset, which we used to train the models, we significantly outperform both a state-of-the-art deep model and a major commercial software system.
Citations: 55
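Under my reading of the abstract, projection pooling aggregates features along an entire row or column and broadcasts the result back, so every position sees global evidence; a small sketch of that operation (its exact placement in the Split model is an assumption):

```python
# Projection pooling sketch: average across one full spatial axis and broadcast
# the result back, so each position carries evidence from its whole row/column.
import torch

def projection_pool(x, dim):
    # x: (B, C, H, W); dim=3 averages over width, dim=2 over height.
    return x.mean(dim=dim, keepdim=True).expand_as(x)

feat = torch.randn(1, 64, 100, 80)
row_profile = projection_pool(feat, dim=3)  # one value per row, broadcast across the row
col_profile = projection_pool(feat, dim=2)  # one value per column, broadcast down it
```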
Improving Text Recognition using Optical and Language Model Writer Adaptation
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00190
Authors: Yann Soullard, Wassim Swaileh, Pierrick Tranouez, T. Paquet, Clément Chatelain
Abstract: State-of-the-art methods for handwriting text recognition are based on deep learning approaches and language modeling that require large data sets during training. In practice, some applications require the system to process mono-writer documents, and it would thus benefit from being trained on examples from that writer. However, it is not common to have numerous examples from just one writer. In this paper, we propose an approach to adapt both the optical model and the language model to a particular writer, starting from a generic system trained on large data sets with a variety of examples. We show the benefits of optical and language model writer adaptation. Our approach reaches competitive results on the READ 2018 data set, which is dedicated to model adaptation to particular writers.
Citations: 13
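The paper adapts both an optical model and a language model; as a generic sketch of the optical-model half, here is a fine-tuning loop that continues training a pretrained recognizer on one writer's few examples. The CTC objective, the model and data-loader objects, and the hyperparameters are assumptions for illustration.

```python
# Writer-adaptation sketch: start from a model trained on large multi-writer data,
# then briefly fine-tune on one writer's samples with a small learning rate.
# Assumes a CTC-trained optical model returning (T, B, num_classes) outputs.
import torch

def adapt_to_writer(model, writer_loader, epochs=3, lr=1e-5):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    ctc = torch.nn.CTCLoss(blank=0, zero_infinity=True)
    model.train()
    for _ in range(epochs):
        for images, targets, in_lens, tgt_lens in writer_loader:
            log_probs = model(images).log_softmax(-1)
            loss = ctc(log_probs, targets, in_lens, tgt_lens)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```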
Learning Free Line Detection in Manuscripts using Distance Transform Graph
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00044
Authors: M. Kassis, Jihad El-Sana
Abstract: We present a fully automated, learning-free method for line detection in manuscripts. We begin by separating components that span multiple lines; we then remove noise and small connected components such as diacritics. We apply a distance transform to the image to create the image skeleton. The skeleton is pruned, and its vertices and edges are detected in order to generate the initial document graph. We calculate each vertex's v-score from its t-score and l-score, quantifying its distance from being an absolute link in a line. In a greedy manner, we classify each edge in the graph as either a link, a bridge, or a conflict edge. We merge pairs of edges classified as links, then merge the conflict edges. Finally, we remove the bridge edges from the graph, producing its final form, in which each edge corresponds to one extracted line. We applied the method to the DIVA-HisDB dataset, on both its public and private sections. The public section was used in the recently conducted Layout Analysis for Challenging Medieval Manuscripts competition, and we achieved results surpassing the vast majority of the participating systems.
Citations: 2
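The graph scoring (t-, l-, and v-scores) is specific to the paper and not reproduced here; as a sketch of the preprocessing it describes, here is a distance transform plus skeletonization of a binarized page, under the assumption that ink pixels are the foreground:

```python
# Preprocessing sketch: distance transform of the binarized page, then a
# 1-pixel-wide skeleton whose junctions and endpoints would seed the line graph.
import numpy as np
from scipy import ndimage
from skimage.morphology import skeletonize

def page_skeleton(binary_img):
    # binary_img: 2D bool array, True = ink.
    dist = ndimage.distance_transform_edt(binary_img)  # distance to background
    skel = skeletonize(binary_img)                     # thin skeleton of the ink
    # Distance values along the skeleton can serve as stroke-width evidence.
    return skel, dist * skel
```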
A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/icdar.2019.00112
Authors: Pei Xu, Shan Huang, Hongzhen Wang, Hao Song, Shen Huang, Qi Ju
Abstract: Chinese keyword spotting is a challenging task, as there are no visual blanks between Chinese words. Unlike English words, which are naturally separated by visual blanks, Chinese words are generally separated only by semantic information. In this paper, we propose a new Chinese keyword spotter for natural images, inspired by Mask R-CNN. We propose to predict the keyword masks guided by text line detection. First, proposals of text lines are generated by Faster R-CNN; then, text line masks and keyword masks are predicted by segmentation within the proposals. In this way, the text lines and keywords are predicted in parallel. We create two Chinese keyword datasets based on RCTW-17 and ICPR MTWI2018 to verify the effectiveness of our method.
Citations: 0
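The authors' text-line and keyword-mask heads are not public here; as a starting-point sketch only, this is the vanilla Mask R-CNN base (via torchvision) that such an approach extends:

```python
# Vanilla Mask R-CNN inference via torchvision, as a base the spotter builds on.
# The paper's text-line guidance and keyword-mask heads are not reproduced.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()
with torch.no_grad():
    preds = model([torch.rand(3, 640, 640)])  # list of dicts: boxes, labels, scores, masks
print(preds[0]["masks"].shape)                # (N, 1, 640, 640) instance masks
```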
A Genetic-Based Search for Adaptive Table Recognition in Spreadsheets
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00206
Authors: Elvis Koci, Maik Thiele, Oscar Romero, Wolfgang Lehner
Abstract: Spreadsheets are very successful content generation tools, used in almost every enterprise to create a wealth of information. However, this information is often intermingled with formatting, layout, and textual metadata, making it hard to identify and interpret the tabular payload. Previous works proposed to solve this problem mainly using heuristics. Although fast to implement, these approaches fail to capture the high variability of user-generated spreadsheet tables. Therefore, in this paper, we propose a supervised approach that is able to adapt to arbitrary spreadsheet datasets. We use a graph model to represent the contents of a sheet, carrying layout and spatial features. Subsequently, we apply genetic-based approaches for graph partitioning to recognize the parts of the graph corresponding to tables in the sheet. The search for tables is guided by an objective function, which is tuned to match the specific characteristics of a given dataset. We demonstrate the feasibility of this approach with an experimental evaluation on a large, real-world spreadsheet corpus.
Citations: 13
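The paper's objective function is tuned per dataset and not specified in this abstract; the toy sketch below only shows the genetic-search skeleton such an approach rests on, with a stand-in fitness function and an assumed cap of four tables per sheet.

```python
# Toy genetic search over graph partitions: a candidate assigns each sheet-graph
# node a table id; one-point crossover and per-gene mutation evolve the population.
# The fitness function and the 4-table cap are illustrative stand-ins.
import random

def evolve(num_nodes, fitness, pop_size=50, generations=100, mutation=0.05):
    pop = [[random.randint(0, 3) for _ in range(num_nodes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, num_nodes)
            child = a[:cut] + b[cut:]                  # one-point crossover
            children.append([random.randint(0, 3) if random.random() < mutation
                             else g for g in child])   # per-gene mutation
        pop = parents + children
    return max(pop, key=fitness)

best = evolve(20, fitness=lambda ind: -len(set(ind)))  # stand-in objective
```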
TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00029
Authors: Shubham Paliwal, D. Vishwanath, R. Rahul, Monika Sharma, L. Vig
Abstract: With the widespread use of mobile phones and scanners to photograph and upload documents, the need to extract the information trapped in unstructured document images such as retail receipts, insurance claim forms, and financial invoices is becoming more acute. A major hurdle to this objective is that these images often contain information in the form of tables, and extracting data from tabular sub-images presents a unique set of challenges. These include accurate detection of the tabular region within an image, and subsequently detecting and extracting information from the rows and columns of the detected table. While some progress has been made in table detection, extracting the table contents is still a challenge, since this involves more fine-grained table structure (rows and columns) recognition. Prior approaches have attempted to solve the table detection and structure recognition problems independently, using two separate models. In this paper, we propose TableNet: a novel end-to-end deep learning model for both table detection and structure recognition. The model exploits the interdependence between the twin tasks of table detection and table structure recognition to segment out the table and column regions. This is followed by semantic rule-based row extraction from the identified tabular sub-regions. The proposed model and extraction approach were evaluated on the publicly available ICDAR 2013 and Marmot table datasets, obtaining state-of-the-art results. Additionally, we demonstrate that feeding additional semantic features further improves model performance, and that the model exhibits transfer learning across datasets. Another contribution of this paper is to provide additional table structure annotations for the Marmot data, which previously had annotations only for table detection.
Citations: 107
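A compact stand-in for TableNet's overall shape follows: one shared convolutional encoder feeding two decoder branches that emit table and column segmentation masks. The layer sizes are illustrative; the published model uses a VGG-based encoder with skip connections, which this sketch omits.

```python
# Stand-in for TableNet's twin-branch shape: shared encoder, two mask decoders.
# Layer sizes are illustrative, not the published architecture.
import torch
import torch.nn as nn

class TwinDecoderNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        def branch():
            return nn.Sequential(
                nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
                nn.ConvTranspose2d(32, 1, 2, stride=2),  # 1-channel mask logits
            )
        self.table_head, self.column_head = branch(), branch()

    def forward(self, x):
        z = self.encoder(x)                  # shared features for both tasks
        return self.table_head(z), self.column_head(z)

table_logits, column_logits = TwinDecoderNet()(torch.randn(1, 3, 256, 256))
```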
Blind Source Separation Based Framework for Multispectral Document Images Binarization
2019 International Conference on Document Analysis and Recognition (ICDAR) | Pub Date: 2019-09-01 | DOI: 10.1109/ICDAR.2019.00237
Authors: Abderrahmane Rahiche, A. Bakhta, M. Cheriet
Abstract: In this paper, we propose a novel Blind Source Separation (BSS) based framework for multispectral (MS) document image binarization. This framework takes advantage of the multidimensional data representation of MS images and uses Graph-regularized Non-negative Matrix Factorization (GNMF) to decompose MS document images into their constituent components, i.e., foreground (text, ink), background (paper, parchment), degradation information, etc. The proposed framework is validated on two real-world data sets of manuscript images, showing a high capability of dealing with variable numbers of bands regardless of the acquisition protocol, different types of degradation, and illumination non-uniformity, while outperforming the results reported in the state of the art. Although the focus is on binary separation (i.e., foreground/background), the proposed framework is also used to decompose document images into different components, i.e., background, text, and degradation, which allows full source separation, whereby further analysis and characterization of each component is possible. A comparative study is performed against Independent Component Analysis (ICA) and Principal Component Analysis (PCA) methods. Our framework is also validated on a third dataset of MS images of natural objects, to demonstrate its generalizability beyond document samples.
Citations: 2
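As a sketch of the matrix-factorization view behind this framework: stack the spectral bands into a non-negative (pixels x bands) matrix and factor it into a few source components. Plain sklearn NMF stands in here for the paper's graph-regularized GNMF, which adds a manifold term sklearn does not provide.

```python
# BSS-style decomposition sketch: factor the multispectral cube into per-pixel
# source abundances (e.g., foreground / background / degradation).
# Plain NMF is a stand-in for the paper's graph-regularized GNMF.
import numpy as np
from sklearn.decomposition import NMF

def separate_sources(ms_image, n_sources=3):
    # ms_image: (H, W, B) non-negative multispectral cube.
    h, w, b = ms_image.shape
    X = ms_image.reshape(-1, b)                      # one row per pixel
    W = NMF(n_components=n_sources, init="nndsvd",
            max_iter=500).fit_transform(X)           # per-pixel source abundances
    return W.reshape(h, w, n_sources)

sources = separate_sources(np.random.rand(64, 64, 8))
```

Thresholding the component that correlates with ink would then yield the binary foreground mask.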