2019 International Conference on Document Analysis and Recognition (ICDAR)最新文献_第9页

Online Writer Identification using GMM Based Feature Representation and Writer-Specific Weights 基于GMM的特征表示和作者特定权重的在线作者识别

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00124

V. Venugopal, S. Sundaram

引用次数: 1

TH-GAN: Generative Adversarial Network Based Transfer Learning for Historical Chinese Character Recognition 基于生成对抗网络的历史汉字识别迁移学习

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00037

Junyang Cai, Liangrui Peng, Yejun Tang, Changsong Liu, Pengchao Li

{"title":"TH-GAN: Generative Adversarial Network Based Transfer Learning for Historical Chinese Character Recognition","authors":"Junyang Cai, Liangrui Peng, Yejun Tang, Changsong Liu, Pengchao Li","doi":"10.1109/ICDAR.2019.00037","DOIUrl":"https://doi.org/10.1109/ICDAR.2019.00037","url":null,"abstract":"Historical Chinese character recognition faces problems including low image quality and lack of labeled training samples. We propose a generative adversarial network (GAN) based transfer learning method to ease these problems. The proposed TH-GAN architecture includes a discriminator and a generator. The network structure of the discriminator is based on a convolutional neural network (CNN). Inspired by Wasserstein GAN, the loss function of the discriminator aims to measure the probabilistic distribution distance of the generated images and the target images. The network structure of the generator is a CNN based encoder-decoder. The loss function of the generator aims to minimize the distribution distance between the real samples and the generated samples. In order to preserve the complex glyph structure of a historical Chinese character, a weighted mean squared error (MSE) criterion by incorporating both the edge and the skeleton information in the ground truth image is proposed as the weighted pixel loss in the generator. These loss functions are used for joint training of the discriminator and the generator. Experiments are conducted on two tasks to evaluate the performance of the proposed TH-GAN. The first task is carried out on style transfer mapping for multi-font printed traditional Chinese character samples. The second task is carried out on transfer learning for historical Chinese character samples by adding samples generated by TH-GAN. Experimental results show that the proposed TH-GAN is effective.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121331539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 12

Zero Shot Learning Based Script Identification in the Wild 零射击学习基于脚本识别在野外

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00162

Prateek Keserwani, K. De, P. Roy, U. Pal

引用次数: 8

An Interactive and Generative Approach for Chinese Shanshui Painting Document 中国山水绘画文献的互动生成方法

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00136

Aven Le Zhou, Qiu-Feng Wang, Kaizhu Huang, C. Lo

引用次数: 9

DeepText: Detecting Text from the Wild with Multi-ASPP-Assembled DeepLab DeepText:使用多asp组装的DeepLab从野外检测文本

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00042

Qingqing Wang, W. Jia, Xiangjian He, Yue Lu, M. Blumenstein, Ye Huang, Shujing Lyu

引用次数: 1

ICDAR 2019 Competition on Harvesting Raw Tables from Infographics (CHART-Infographics) ICDAR 2019从信息图表中获取原始表格竞赛(CHART-Infographics)

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00203

Kenny Davila, B. Kota, S. Setlur, V. Govindaraju, Chris Tensmeyer, Sumit Shekhar, Ritwick Chaudhry

{"title":"ICDAR 2019 Competition on Harvesting Raw Tables from Infographics (CHART-Infographics)","authors":"Kenny Davila, B. Kota, S. Setlur, V. Govindaraju, Chris Tensmeyer, Sumit Shekhar, Ritwick Chaudhry","doi":"10.1109/ICDAR.2019.00203","DOIUrl":"https://doi.org/10.1109/ICDAR.2019.00203","url":null,"abstract":"This work summarizes the results of the first Competition on Harvesting Raw Tables from Infographics (ICDAR 2019 CHART-Infographics). The complex process of automatic chart recognition is divided into multiple tasks for the purpose of this competition, including Chart Image Classification (Task 1), Text Detection and Recognition (Task 2), Text Role Classification (Task 3), Axis Analysis (Task 4), Legend Analysis (Task 5), Plot Element Detection and Classification (Task 6.a), Data Extraction (Task 6.b), and End-to-End Data Extraction (Task 7). We provided a large synthetic training set and evaluated submitted systems using newly proposed metrics on both synthetic charts and manually-annotated real charts taken from scientific literature. A total of 8 groups registered for the competition out of which 5 submitted results for tasks 1-5. The results show that some tasks can be performed highly accurately on synthetic data, but all systems did not perform as well on real world charts. The data, annotation tools, and evaluation scripts have been publicly released for academic use.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134010434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 23

Parameter-Free Table Detection Method 无参数表检测方法

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00079

Laiphangbam Melinda, C. Bhagvati

{"title":"Parameter-Free Table Detection Method","authors":"Laiphangbam Melinda, C. Bhagvati","doi":"10.1109/ICDAR.2019.00079","DOIUrl":"https://doi.org/10.1109/ICDAR.2019.00079","url":null,"abstract":"In this paper, we propose two parameter-free table detection methods: one for the closed tables and other for open tables. The unifying idea is multigaussian analysis. Multigaussian analysis of text height histograms classifies the document content into text and non-text blocks. Closed tables are classified as non-text and their identification from the non-text blocks is similar to many earlier methods that remove the separators. We do not need any parameters to identify rows and columns and discriminate them from text blocks because of multigaussian analysis. Open tables are initially classified as text blocks and are detected by extending the multigaussian analysis to the heights and widths of text blocks. The text-blocks are grouped into three categories by multigaussian analysis. These groups are used to classify table cells and distinguish them from text blocks. Table blocks are merged to obtain the table region. Evaluation on various Indic script newspapers and ICDAR2013 table competition dataset shows that our methods achieve more than 90% in table recognition. The strength of our algorithm is that it is a parameter-free approach and requires no training dataset.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"17 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134105087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

Two Stream Deep Network for Document Image Classification 文档图像分类的二流深度网络

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00227

M. Asim, Muhammad Usman Ghani Khan, M. I. Malik, K. Razzaque, A. Dengel, Sheraz Ahmed

引用次数: 15

Article Segmentation in Digitised Newspapers with a 2D Markov Model 基于二维马尔可夫模型的数字化报纸文章分割

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00165

Andrew Naoum, J. Nothman, J. Curran

引用次数: 9

Unsupervised OCR Model Evaluation Using GAN 基于GAN的无监督OCR模型评估

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI: 10.1109/ICDAR.2019.00-42

Abhash Sinha, Martin Jenckel, S. S. Bukhari, A. Dengel

{"title":"Unsupervised OCR Model Evaluation Using GAN","authors":"Abhash Sinha, Martin Jenckel, S. S. Bukhari, A. Dengel","doi":"10.1109/ICDAR.2019.00-42","DOIUrl":"https://doi.org/10.1109/ICDAR.2019.00-42","url":null,"abstract":"Optical Character Recognition (OCR) has achieved its state-of-the-art performance with the use of Deep Learning for character recognition. Deep Learning techniques need large amount of data along with ground truth. Out of the available data, small portion of it has to be used for validation purpose as well. Preparing ground truth for historical documents is expensive and hence availability of data is of utmost concern. Jenckel et al. jenckel came up with an idea of using all the available data for training the OCR model and for the purpose of validation, they generated the input image from Softmax layer of the OCR model; using the decoder setup which can be used to compare with the original input image to validate the OCR model. In this paper, we have explored the possibilities of using Generative Adversial Networks (GANs) gan for generating the image directly from the text obtained from OCR model instead of using the Softmax layer which is not always accessible for all the Deep Learning based OCR models. Using text directly to generate the input image back gives us the advantage to use this pipeline for any OCR models even whose Softmax layer is not accessible. In the results section, we have shown that the current state of using GANs for unsupervised OCR model evaluation.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114848202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2