Region Proposal for Pattern Spotting in Historical Document Images
Sovann En, C. Petitjean, Stéphane Nicolas, L. Heutte, F. Jurie
2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016-10-01
DOI: 10.1109/ICFHR.2016.0075 (https://doi.org/10.1109/ICFHR.2016.0075)
Citations: 5
Abstract
Pattern spotting consists of searching a document image for occurrences of a queried graphical object. The main challenge is that the query image is generally small and its occurrences may be located anywhere in the image. Rather than exhaustively indexing all possible subwindows extracted from the document images, the common approach is to rely on segmentation or document layout analysis to limit the search space. However, no segmentation or document layout analysis technique is reliable enough for historical document images. Region proposal, a technique used to generate a set of regions potentially containing an object, has recently contributed to many state-of-the-art object detection systems. Although it was initially proposed for object detection, we show that region proposal also offers promising results for document images, particularly for pattern spotting. In this paper, we investigate the use of region proposal to produce high-quality subwindows that replace both the usual document layout analysis step and the blind sliding-window step. From experiments conducted on the DocExplore dataset, we show that region proposal generates a comparable number of subwindows while helping the system achieve significantly better results than a system built with commonly used layout analysis techniques.
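
To make the cost of the blind sliding-window step concrete, the following minimal sketch enumerates every subwindow of a page for a given set of window sizes and a fixed stride. It is an illustration only, not the authors' implementation: the function names, the page dimensions, the window sizes, and the stride are all hypothetical values chosen for the example.

```python
from typing import Iterator, List, Tuple

# A box is (x, y, width, height) in pixels; this layout is an assumption
# made for the example, not a convention taken from the paper.
Box = Tuple[int, int, int, int]


def sliding_windows(
    img_w: int, img_h: int, win_sizes: List[Tuple[int, int]], stride: int
) -> Iterator[Box]:
    """Yield every window of each given size, stepping by `stride` pixels."""
    for w, h in win_sizes:
        # A window larger than the page cannot be placed anywhere.
        if w > img_w or h > img_h:
            continue
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


def count_windows(
    img_w: int, img_h: int, win_sizes: List[Tuple[int, int]], stride: int
) -> int:
    """Count the windows produced by `sliding_windows` without storing them."""
    return sum(1 for _ in sliding_windows(img_w, img_h, win_sizes, stride))


if __name__ == "__main__":
    # Hypothetical page size and window sizes, for illustration only.
    sizes = [(64, 64), (128, 128), (256, 128)]
    print(count_windows(2000, 3000, sizes, stride=16))
```

The count grows with the page area divided by the squared stride, and again with the number of window sizes, so every such subwindow must be described and indexed whether or not it contains anything of interest. A region proposal method would stand in for `sliding_windows` and return only candidate boxes likely to contain an object.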