Semantic Text Detection in Born-Digital Images via Fully Convolutional Networks

Nibal Nayef, J. Ogier
2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), published 2017-11-01
DOI: 10.1109/ICDAR.2017.145 (https://doi.org/10.1109/ICDAR.2017.145)
Citations: 2

Abstract

Traditional layout analysis methods cannot be easily adapted to born-digital images, which carry properties of both regular document images and natural scene images. One approach to analyzing the layout of born-digital images is to separate the text layer from the graphics layer before analyzing either of them further. In this paper, we propose a method for detecting text regions in such images by casting the detection problem as a semantic object segmentation problem. Text classification is done holistically using fully convolutional networks: the full image is fed as input to the network, and the output is a pixel heat map of the same size as the input image. This solves the problem of low-resolution images and the variability of text scale within one image. It also eliminates the need to find interest points, candidate text locations or low-level components. Experimental evaluation on the ICDAR 2013 dataset shows that our method outperforms state-of-the-art methods. The detected text regions also leave the flexibility to later apply methods for finding text components at the character, word or textline level in different orientations.
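
The "full image in, same-size pixel heat map out" idea can be illustrated with a minimal sketch. This is not the authors' network: it assumes a single 3x3 convolution with hand-set weights followed by a sigmoid, whereas a real fully convolutional network stacks many learned layers. The names `conv2d_same`, `text_heat_map`, `text_mask`, `KERNEL` and `BIAS` are hypothetical and chosen only for this illustration. What it does show is the property the abstract relies on: because there is no fixed-size fully connected layer, any input size yields a heat map of the same size.

```python
import math


def conv2d_same(image, kernel, bias=0.0):
    """Stride-1 convolution with zero padding, so output size equals input size."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    # Pixels outside the image count as zero (zero padding).
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def text_heat_map(image, kernel, bias):
    """Map a 2-D image to per-pixel scores in (0, 1), same height and width."""
    scores = conv2d_same(image, kernel, bias)
    return [[sigmoid(v) for v in row] for row in scores]


def text_mask(heat_map, threshold=0.5):
    """Binarize the heat map; the threshold value is an assumption."""
    return [[1 if p >= threshold else 0 for p in row] for row in heat_map]


# Hypothetical hand-set weights standing in for learned parameters.
KERNEL = [[1.0, 1.0, 1.0],
          [1.0, 1.0, 1.0],
          [1.0, 1.0, 1.0]]
BIAS = -4.5

# Toy 5x6 "image": a 3x3 block of ones plays the role of a text region.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

heat = text_heat_map(image, KERNEL, BIAS)
mask = text_mask(heat)

# A differently sized input needs no resizing or cropping.
wide_heat = text_heat_map([[0] * 9 for _ in range(3)], KERNEL, BIAS)
```

Thresholding the heat map, as `text_mask` does, is one simple way to turn per-pixel scores into candidate text regions; the abstract does not say which post-processing the authors use.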