Historical Document Text Binarization using Atrous Convolution and Multi-Scale Feature Decoder

2019 Digital Image Computing: Techniques and Applications (DICTA) | Pub Date: 2019-12-01 | DOI: 10.1109/DICTA47822.2019.8946108

Hanif Rasyidi, S. Khan


Citations: 2

Abstract


Historical Document Text Binarization using Atrous Convolution and Multi-Scale Feature Decoder

This paper presents a segmentation-based binarization model that uses convolutional neural networks to extract text from historical documents. The proposed method uses atrous convolution for feature extraction, learning useful text patterns from the document without significantly reducing the spatial size of the image. The model then combines the extracted features with a multi-scale decoder to construct a binary image that contains only the document's text. We train our model on a series of DIBCO competition datasets and compare its results with existing text binarization methods as well as a state-of-the-art object segmentation model. Experimental results on the H-DIBCO 2016 dataset show that our method performs strongly on the pseudo F-Score metric, surpassing various existing methods.
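The abstract's central idea, that atrous (dilated) convolution enlarges the receptive field without shrinking the feature map, can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the function names `atrous_conv2d` and `receptive_field`, the zero padding, and the square odd-sized kernel are assumptions made for illustration only.

```python
def atrous_conv2d(image, kernel, rate=1):
    """Zero-padded 2D atrous (dilated) convolution, cross-correlation form.

    Hypothetical helper for illustration. Assumes a square, odd-sized
    kernel. The output has the same height and width as the input.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `rate` pixels apart.
                    yy = y + (i - half) * rate
                    xx = x + (j - half) * rate
                    # Taps that fall outside the image count as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def receptive_field(kernel_size, rate):
    """Side length of the input window covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


if __name__ == "__main__":
    # A 7x7 image with a single bright pixel at the center.
    image = [[0.0] * 7 for _ in range(7)]
    image[3][3] = 1.0
    kernel = [[1.0] * 3 for _ in range(3)]

    for rate in (1, 2):
        result = atrous_conv2d(image, kernel, rate)
        print(rate, len(result), len(result[0]), receptive_field(3, rate))
```

With a 3x3 kernel, a rate of 1 covers a 3x3 window and a rate of 2 covers a 5x5 window, while the output stays 7x7 in both cases. A network can therefore gather wider context around thin pen strokes without the downsampling that would blur them.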

