Region Proposal for Pattern Spotting in Historical Document Images

Sovann En, C. Petitjean, Stéphane Nicolas, L. Heutte, F. Jurie
Published in: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016-10-01
DOI: 10.1109/ICFHR.2016.0075 (https://doi.org/10.1109/ICFHR.2016.0075)
Citations: 5

Abstract

Pattern spotting consists of searching a document image for occurrences of a queried graphical object. The main challenge is that the query image is generally small and its occurrences may be located anywhere in the image. Rather than exhaustively indexing all possible subwindows extracted from the document images, the common approach is to rely on segmentation or document layout analysis to limit the search space. However, neither segmentation nor document layout analysis techniques are reliable enough for historical document images. Region proposal, a technique used to generate a set of regions potentially containing an object, has recently contributed to many state-of-the-art object detection systems. Although it was initially proposed for object detection, we show that region proposal also offers promising results for document images, particularly for pattern spotting. In this paper, we investigate the use of region proposal to produce high-quality subwindows that replace both the usual document layout analysis step and the blind sliding-window step. Experiments conducted on the DocExplore dataset show that region proposal generates a comparable number of subwindows while helping the system achieve significantly better results than a system built with commonly used layout analysis techniques.
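
To give a rough sense of why exhaustively indexing all possible subwindows is impractical, the following minimal Python sketch counts the windows that a blind multi-scale sliding-window scan would visit on a single page. The function names, page size, window sizes, and stride are illustrative assumptions and are not taken from the paper; the sketch only covers the counting argument and does not implement any region proposal method.

```python
def count_sliding_windows(page_w, page_h, win_w, win_h, stride):
    """Count the win_w x win_h windows visited when scanning a page with a fixed stride."""
    if win_w > page_w or win_h > page_h:
        return 0  # the window does not fit on the page at all
    positions_x = (page_w - win_w) // stride + 1
    positions_y = (page_h - win_h) // stride + 1
    return positions_x * positions_y


def count_multiscale_windows(page_w, page_h, window_sizes, stride):
    """Sum the window counts over several (width, height) window sizes."""
    return sum(
        count_sliding_windows(page_w, page_h, w, h, stride)
        for w, h in window_sizes
    )


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    page_w, page_h = 2000, 3000
    window_sizes = [(64, 64), (128, 128), (256, 256)]
    total = count_multiscale_windows(page_w, page_h, window_sizes, stride=16)
    print(f"subwindows for one page: {total}")
```

The count grows with the number of scales and shrinks with the square of the stride, so covering small queries at fine strides quickly multiplies the number of subwindows to index per page. This is the search space that layout analysis, or region proposal as investigated here, is meant to reduce.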