Removal of Historical Document Degradations using Conditional GANs

International Conference on Pattern Recognition Applications and Methods. Pub Date: 2019-02-19. DOI: 10.5220/0007367701450154

Veeru Dumpala, Sheela Raju Kurupathi, S. S. Bukhari, A. Dengel


Citations: 6

Abstract


Removal of Historical Document Degradations using Conditional GANs

Document binarization is one of the most crucial problems in document analysis and OCR pipelines. Over the past few decades, many traditional algorithms such as Sauvola, Niblack, and Otsu have been used for binarization, but they give insufficient results on historical texts with degradations. Recently, many attempts have been made to solve binarization with deep learning approaches such as autoencoders and FCNs. However, these models do not generalize well, qualitatively, to real-world historical document images. In this paper, we propose a model based on a conditional GAN, an architecture well known for high-resolution image synthesis. Here, the proposed model is used for an image manipulation task: removing degradations in historical documents such as stains, bleed-through, and non-uniform shading. The proposed model outperforms recent state-of-the-art models for document image binarization. We support our claims by benchmarking the proposed model on the publicly available PHIBC 2012, DIBCO (2009-2017), and Palm Leaf datasets. The main objective of this paper is to illuminate the advantages of generative modeling and adversarial training for document image binarization in a supervised setting, which shows good generalization capabilities on different inter/intra-class domain document images.
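For readers unfamiliar with the classical baselines named in the abstract, the following is a minimal sketch of Otsu's global thresholding in plain Python. It is not the paper's method and not code from the authors; the function names `otsu_threshold` and `binarize` and the flat list of 8-bit gray values are assumptions made for illustration. The sketch picks the single threshold that maximizes between-class variance, which hints at why one global cut-off can struggle on pages with stains or non-uniform shading.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for 8-bit gray values (hypothetical helper).

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map gray values to 0 (ink) or 255 (background) using threshold t."""
    return [0 if p <= t else 255 for p in pixels]


# Toy "page": dark ink around 30, bright paper around 220 (made-up values).
page = [30] * 40 + [220] * 60
t = otsu_threshold(page)
binary = binarize(page, t)
```

On a cleanly bimodal input like this toy page, the threshold falls between the two intensity groups. On a degraded page, where a stain may be as dark as faded ink, no single value of `t` separates text from background, which is the limitation the abstract attributes to the traditional algorithms.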
