{"title":"Ground-Truth and Performance Evaluation for Page Layout Analysis of Born-Digital Documents","authors":"Xin Tao, Zhi Tang, Canhui Xu, Liangcai Gao","doi":"10.1109/DAS.2014.37","DOIUrl":"https://doi.org/10.1109/DAS.2014.37","url":null,"abstract":"In this paper, a new dataset is proposed for page layout analysis of born-digital documents. By extracting uniformly the document contents, an XML based data format is designed in terms of raw data and structure data. Utilizing a self-developed ground-truthing tool, a public dataset is constructed from diverse styles of document resources. With consideration of physical segmentation and logical labeling, automatic performance evaluation methods are adjusted to cope with different scenarios. The applications of the proposed dataset have shown that it is suitable for evaluating various layout analysis tasks.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125189137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Printer Identification Using Supervised Learning for Document Forgery Detection","authors":"Sarah Elkasrawi, F. Shafait","doi":"10.1109/DAS.2014.48","DOIUrl":"https://doi.org/10.1109/DAS.2014.48","url":null,"abstract":"Identifying the source printer of a document is important in forgery detection. The larger the number of documents to be investigated for forgery, the less time-efficient manual examination becomes. Assuming the document in question was scanned, the accuracy of automatic forgery detection depends on the scanning resolution. Low (100-200 dpi) and common (300-400 dpi) resolution scans have less distinctive features than high-end scanner resolution, whereas the former is more widespread in offices. In this paper, we propose a method to automatically identify source printers using common-resolution scans (400 dpi). Our method depends on distinctive noise produced by printers. Independent of the document content or size, each printer produces noise depending on its printing technique, brand and slight differences due to manufacturing imperfections. Experiments were carried out on a set of 400 documents of similar structure printed using 20 different printers. The documents were scanned at 400 dpi using the same scanner. Assuming constant settings of the printer, the overall accuracy of the classification was 76.75%.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129318694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustness Assessment of Texture Features for the Segmentation of Ancient Documents","authors":"Maroua Mehri, V. C. Kieu, Mohamed Mhiri, P. Héroux, Petra Gomez-Krämer, M. Mahjoub, R. Mullot","doi":"10.1109/DAS.2014.22","DOIUrl":"https://doi.org/10.1109/DAS.2014.22","url":null,"abstract":"For the segmentation of ancient digitized document images, it has been shown that texture feature analysis is a consistent choice for meeting the need to segment a page layout under significant and various degradations. In addition, it has been proven that the texture-based approaches work effectively without hypothesis on the document structure, neither on the document model nor the typographical parameters. Thus, by investigating the use of texture as a tool for automatically segmenting images, we propose to search homogeneous and similar content regions by analyzing texture features based on a multiresolution analysis. The preliminary results show the effectiveness of the texture features extracted from the autocorrelation function, the Grey Level Co-occurrence Matrix (GLCM), and the Gabor filters. In order to assess the robustness of the proposed texture-based approaches, images under numerous degradation models are generated and two image enhancement algorithms (non-local means filtering and superpixel techniques) are evaluated by several accuracy metrics. This study shows the robustness of texture feature extraction for segmentation in the case of noise and the uselessness of a demising step.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126775938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An on-line platform for ground truthing and performance evaluation of text extraction systems","authors":"Dimosthenis Karatzas, Sergi Robles Mestre, L. G. I. Bigorda","doi":"10.1109/DAS.2014.49","DOIUrl":"https://doi.org/10.1109/DAS.2014.49","url":null,"abstract":"This paper presents a set of on-line software tools for creating ground truth and calculating performance evaluation metrics for text extraction tasks such as localization, segmentation and recognition. The platform supports the definition of comprehensive ground truth information at different text representation levels while it offers centralised management and quality control of the ground truthing effort. It implements a range of state of the art performance evaluation algorithms and offers functionality for the definition of evaluation scenarios, on-line calculation of various performance metrics and visualisation of the results. The presented platform, which comprises the backbone of the ICDAR 2011 (challenge 1) and 2013 (challenges 1 and 2) Robust Reading competitions, is now made available for public use.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126900117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical Evaluation of CRF-Based Bibliography Extraction from Reference Strings","authors":"Manabu Ohta, Daiki Arauchi, A. Takasu, J. Adachi","doi":"10.1109/DAS.2014.64","DOIUrl":"https://doi.org/10.1109/DAS.2014.64","url":null,"abstract":"This paper reports an empirical evaluation of a CRF-based bibliography parser we have developed for reference strings of research papers. The parser uses a conditional random field (CRF) to estimate the correct bibliographic label such as an author's name and a title for each token in a reference string. We applied the parser specifically designed for reference strings to three academic journals, an English one and two Japanese ones, published in Japan. Experiments showed (i) the parser correctly parsed from 90% to 94% of reference strings depending on the kinds of journals used and (ii) segmentation errors induced by tokenization considerably degraded the final parsing accuracies. This paper also discusses some future directions of the bibliography extraction based on a detailed analysis of the experiments.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127129606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Classification of an Industrial Document Image Database by Combining Visual and Textual Features","authors":"Olivier Augereau, N. Journet, A. Vialard, J. Domenger","doi":"10.1109/DAS.2014.44","DOIUrl":"https://doi.org/10.1109/DAS.2014.44","url":null,"abstract":"The main contribution of this paper is a new method for classifying document images by combining textual features extracted with the Bag of Words (BoW) technique and visual features extracted with the Bag of Visual Words (BoVW) technique. The BoVW is widely used within the computer vision community for scene classification or object recognition but few applications for the classification of entire document images have been submitted. While previous attempts have been showing disappointing results by combining visual and textual features with the Borda-count technique, we're proposing here a combination through learning approach. Experiments conducted on a 1925 document image industrial database reveal that this fusion scheme significantly improves the classification performances. Our concluding contribution deals with the choosing and tuning of the BoW and/or BoVW techniques in an industrial context.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128526256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Detection Using Delaunay Triangulation in Video Sequence","authors":"Liang Wu, P. Shivakumara, Tong Lu, C. Tan","doi":"10.1109/DAS.2014.28","DOIUrl":"https://doi.org/10.1109/DAS.2014.28","url":null,"abstract":"Text detection and tracking in video sequence is gaining interest due to the challenges posed by low resolution and complex background. This paper proposes a new method for text detection by estimating trajectories between the corners of texts in video sequence over time. Each trajectory is considered as one node to form a graph for all trajectories and Delaunay triangulation is used to obtain edges to connect nodes of the graph. In order to identify the edges that represent text regions, we propose four pruning criteria based on spatial proximity, motion coherence, local appearance and canny rate. This results in several sub-graphs. Then we use depth first search to collect corner points, which essentially represent text candidates. False positives are eliminated using heuristics and missing trajectories will be obtained by tracking the corners in temporal frames. We test the method on different videos and evaluate the method in terms of recall, precision, f-measure with existing results. Experimental result shows that the proposed method is superior to existing method.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130584183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OCR Performance Prediction Using a Bag of Allographs and Support Vector Regression","authors":"T. Bhowmik, T. Paquet, N. Ragot","doi":"10.1109/DAS.2014.72","DOIUrl":"https://doi.org/10.1109/DAS.2014.72","url":null,"abstract":"In this paper, we describe a novel and simple technique for prediction of OCR results without using any OCR. The technique uses a bag of allographs to characterize textual components. Then a support vector regression (SVR) technique is used to build a predictor based on the bag of allographs. The performance of the system is evaluated on a corpus of historical documents. The proposed technique produces correct prediction of OCR results on training and test documents within the range of standard deviation of 4.18% and 6.54% respectively. The proposed system has been designed as a tool to assist selection of corpora in libraries and specify the typical performance that can be expected on the selection.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133916175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Typed and Handwritten Text Block Segmentation System for Heterogeneous and Complex Documents","authors":"Philippine Barlas, Sébastien Adam, Clément Chatelain, T. Paquet","doi":"10.1109/DAS.2014.39","DOIUrl":"https://doi.org/10.1109/DAS.2014.39","url":null,"abstract":"This paper presents a Document Image Analysis (DIA) system able to extract homogeneous typed and handwritten text regions from complex layout documents of various types. The method is based on two connected component classification stages that successively discriminate text/non text and typed/handwritten shapes, followed by an original block segmentation method based on white rectangles detection. We present the results obtained by the system during the first competition round of the MAURDOR campaign.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115344731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph Model Optimization Based Historical Chinese Character Segmentation Method","authors":"Jingning Ji, Liangrui Peng, Bohan Li","doi":"10.1109/DAS.2014.57","DOIUrl":"https://doi.org/10.1109/DAS.2014.57","url":null,"abstract":"Historical Chinese document recognition technology is important for digital library. However, historical Chinese character segmentation remains a difficult problem due to the complex structure of Chinese characters and various writing styles. This paper presents a novel method for historical Chinese character segmentation based on graph model. After a preliminary over-segmentation stage, the system applies a merging process. The candidate segmentation positions are denoted by the nodes of a graph, and the merging process is regarded as selecting an optimal path of the graph. The weight of edge in the graph is calculated by the cost function which considers geometric features and recognition confidence. Experimental results show that the proposed method is effective with a detection rate of 94.6% and an accuracy rate of 96.1% on a test set of practical historical Chinese document samples.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124639765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}