{"title":"PDF-TREX: An Approach for Recognizing and Extracting Tables from PDF Documents","authors":"Ermelinda Oro, M. Ruffolo","doi":"10.1109/ICDAR.2009.12","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.12","url":null,"abstract":"This paper presents PDF-TREX, a heuristic approach for table recognition and extraction from PDF documents. The heuristic starts from an initial set of basic content elements and aligns and groups them in a bottom-up way, considering only their spatial features, in order to identify tabular arrangements of information. The aim of the approach is to recognize tables contained in PDF documents as a 2-dimensional grid on a Cartesian plane and extract them as a set of cells equipped with 2-dimensional coordinates. Experiments carried out on a dataset of tables drawn from documents in different domains show that the approach performs well in recognizing table cells. The approach aims at improving PDF document annotation and information extraction by providing an output that can be further processed for understanding table and document contents.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128798973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Text-independent Writer Identification Based on Temporal Sequence and Shape Codes","authors":"Bangyu Li, T. Tan","doi":"10.1109/ICDAR.2009.26","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.26","url":null,"abstract":"In this paper we present a novel method for online text-independent writer identification. Most existing writer identification techniques require the data to come from a specific text, which is not applicable to cases where such text is not available, such as in criminal justice systems when text documents with different content need to be compared. Text-independent approaches often require a large amount of data to be confident of good results. We propose temporal sequence and shape codes to encode online handwriting. Temporal sequence codes (TSC) characterize the trajectory in terms of speed and pressure change during writing, and shape codes (SC) characterize the direction of the trajectory during writing. For TSC, we use two different codes to encode speed and pressure into a codebook: stroke temporal sequence codes (STSC) and neighbor temporal sequence codes (NTSC). At the identification stage, we implement a decision and fusion strategy to identify the writer. Experimental results show that our proposed method can improve the identification accuracy with a small number of characters. Moreover, we find that the proposed method is effective even for cross-language (English & Chinese) writer identification.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124569966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Character-Structure-Guided Approach to Estimating Possible Orientations of a Rotated Isolated Online Handwritten Chinese Character","authors":"Tingting He, Qiang Huo","doi":"10.1109/ICDAR.2009.84","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.84","url":null,"abstract":"This paper presents a character-structure-guided approach to estimating possible orientations of a rotated isolated online handwritten Chinese character. Using the estimated orientations, the original distorted sample can be transformed to a normal position, which can be recognized more accurately by using a classifier trained from normal-position samples. The effectiveness of this approach is demonstrated by recognizing rotated samples generated artificially from the popular Nakayosi and Kuchibue Japanese character databases, with average recognition accuracies of 96.05%, 97.35% and 99.13% on top-6, top-12, and top-100 candidates, respectively.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121335276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedded Bernoulli Mixture HMMs for Handwritten Word Recognition","authors":"Adrià Giménez, Alfons Juan-Císcar","doi":"10.1109/ICDAR.2009.66","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.66","url":null,"abstract":"Hidden Markov Models (HMMs) are now widely used in off-line handwritten word recognition. As in speech recognition, they are usually built from shared, embedded HMMs at symbol level, in which state-conditional probability density functions are modelled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of real-valued features should be used and, indeed, very different feature sets are in use today. In this paper, we propose to by-pass feature extraction and directly feed columns of raw, binary image pixels into embedded Bernoulli mixture HMMs, that is, embedded HMMs in which the emission probabilities are modelled with Bernoulli mixtures. The idea is to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. Empirical results are reported in which similar results are obtained with both Bernoulli and Gaussian mixtures, though Bernoulli mixtures are much simpler.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114208439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning on the Fly: Font-Free Approaches to Difficult OCR Problems","authors":"Andrew Kae, E. Learned-Miller","doi":"10.1109/ICDAR.2009.260","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.260","url":null,"abstract":"Despite ubiquitous claims that optical character recognition (OCR) is a \"solved problem,\" many categories of documents continue to break modern OCR software, such as documents with moderate degradation or unusual fonts. Many approaches rely on pre-computed or stored character models, but these are vulnerable to cases when the font of a particular document was not part of the training set, or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it is trying to recognize. We use these learned models both to segment the characters and to recognize them in an incremental, iterative process. We present results comparable to those of a commercial OCR system on a subset of characters from a difficult test document.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116473688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICDAR 2009 Handwritten Farsi/Arabic Character Recognition Competition","authors":"S. Mozaffari, Hadi Soltanizadeh","doi":"10.1109/ICDAR.2009.283","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.283","url":null,"abstract":"In recent years, the recognition of Farsi and Arabic handwriting has drawn increasing attention. This paper describes the results of the ICDAR 2009 competition for handwritten Farsi/Arabic character recognition. To evaluate the submitted systems, we used large datasets containing both binary and gray-scale images. Many different groups downloaded the training sets; however, in the end only 4 systems successfully participated in the competition. The systems were tested on two known databases and one unknown dataset. Due to the similarity between some digits and characters in Farsi and Arabic, each recognizer was tested for digit and character sets separately. For benchmarking, only the recognition rates, as the most important characteristic, are considered. Since participants used different software and even operating systems, the relative recognition speed is not compared in this competition.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127599550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Feedback-Based Multi-Classifier System","authors":"G. Pirlo, C. A. Trullo, D. Impedovo","doi":"10.1109/ICDAR.2009.75","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.75","url":null,"abstract":"The multi-classifier approach is a widespread strategy used in many difficult classification problems. Traditionally, in a multi-classifier approach, a classification decision based on the combination of a multitude of classifiers is expected to outperform the decisions of each individual classifier. Therefore, in a multi-classifier system, the potential of the whole set of classifiers is only exploited at the level of the final decision, in which the contributions of all classifiers are used by combining their individual decisions. This paper presents a feedback-based multi-classifier system in which the multi-classifier approach is used not only for providing the final decision, but also for improving the performance of the individual classifiers, by means of a closed-loop strategy. The experimental tests have been carried out in the field of handwritten numeral recognition. The results demonstrate the effectiveness of the proposed approach and its superiority with respect to the traditional approach.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127744980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Rejection Measurement in Handwritten Numeral Recognition Based on Linear Discriminant Analysis","authors":"C. He, L. Lam, C. Suen","doi":"10.1109/ICDAR.2009.89","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.89","url":null,"abstract":"This paper presents a Linear Discriminant Analysis based Measurement (LDAM) on the output from classifiers as a criterion to reject the patterns which cannot be classified with high reliability. This is important in applications (such as in processing of financial documents) where errors can be very costly and therefore less tolerable than rejections. To implement the rejection, which can be considered to be a two-class problem of accepting the classification result or otherwise, Linear Discriminant Analysis (LDA) is used to determine the rejection threshold in a new approach. LDAM is designed to take into consideration the confidence values of the classifier outputs and the relations between them, and it is an improvement over traditional rejection measurements such as First Rank Measurement (FRM) and First Two Ranks Measurement (FTRM). Experiments are conducted on the CENPARMI Arabic Isolated Numerals Database. The results show that LDAM is more effective, and it can achieve a higher reliability while achieving a high recognition rate.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126271004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fisher Kernels for Handwritten Word-spotting","authors":"F. Perronnin, José A. Rodríguez-Serrano","doi":"10.1109/ICDAR.2009.16","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.16","url":null,"abstract":"The Fisher kernel is a generic framework which combines the benefits of generative and discriminative approaches to pattern classification. In this contribution, we propose to apply this framework to handwritten word-spotting. Given a word image and a keyword generative model, the idea is to generate a vector which describes how the parameters of the keyword model should be modified to best fit the word image. This vector can then be used as the input of a discriminative classifier. We compare the performance of the proposed approach with that of a generative baseline on a challenging real-world dataset of customer letters. When the kernel used by the classifier is linear, the performance improvement is marginal but the proposed system is approximately 15 times faster than the baseline. If we use a non-linear kernel devised for this task, we obtain a 15% relative reduction of the error but the detector is approximately 15 times slower.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125426379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disease-Specific Extraction of Text from Cardiac Echo Videos for Decision Support","authors":"T. Syeda-Mahmood, D. Beymer, A. Amir","doi":"10.1109/ICDAR.2009.269","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.269","url":null,"abstract":"Echo videos are an important modality for cardiac decision support. In addition to describing the shape and motion of the heart, they capture important diagnostic measurements as textual feature-value pairs that are good indicators of the underlying disease. In this paper, we describe reliable extraction of such textual information through selective image processing and region extraction prior to using an OCR engine. We then use tabular layout analysis rules to recover measurement attribute value pairs from the recognized text in videos. The measurement feature-value pairs are used to retrieve matching videos from a database. A ranked list of matching diseases is then obtained through collaborative filtering. Results are demonstrated on a large echo video database of patients with various diseases.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125639567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}