{"title":"Content-level Annotation of Large Collection of Printed Document Images","authors":"Anand Kumar, C. V. Jawahar","doi":"10.1109/ICDAR.2007.89","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.89","url":null,"abstract":"A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creation of annotated corpora is a tedious task. It is laborious, especially when the annotation is at the character level. In this paper, we propose an efficient hierarchical approach for annotation of large collection of printed document images. We align document images with independently keyed-in text. The method is model-driven and is intended to annotate large collection of documents, scanned in three different resolutions, at character level. We employ an XML representation for storage of the annotation information. APIs are provided for access at content level for easy use in training and evaluation of OCRs and other document understanding tasks.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130407784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Thresholding Algorithm for Brazilian Bank Checks","authors":"C. Mello, B. Bezerra, C. Zanchettin, V. Macário","doi":"10.1109/ICDAR.2007.50","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.50","url":null,"abstract":"Presented herein is an algorithm for thresholding images of bank checks. These images have complex background elements. Some of these patterns make it very hard to distinguish between the text and the texture pattern defined by the bank. For the binarization process, an adaptive global thresholding algorithm based on ROC curves is proposed and compared to several well-known algorithms. The images generated by the new algorithm achieved a hit rate of 97% for recognition of the CMC7 code.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121481540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Approach to Improve Accuracy Rate of On-line Signature Verification Systems of Different Sizes","authors":"R. Araujo, George D. C. Cavalcanti, E. C. B. C. Filho","doi":"10.1109/ICDAR.2007.46","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.46","url":null,"abstract":"This paper discusses the problem of size variation in on-line signature verification systems. The main idea of the article is to investigate the influence of size variation on feature extraction techniques and how this distortion can affect the final classification performance of the systems. In this study, a new classification approach based on the work of Kholmatov and Yanikoglu was suggested in order to measure this performance. In addition, a feature selection technique was applied in the description of the patterns with the purpose of overcoming the size variation problem. All the experiments were performed on a database constructed with signatures of three different sizes and skilled forgeries. This kind of study plays an important role in the implementation of systems that use different signature sources.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121683877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Segmentation from Complex Background Using Sparse Representations","authors":"Wumo Pan, T. D. Bui, C. Suen","doi":"10.1109/ICDAR.2007.246","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.246","url":null,"abstract":"A novel text segmentation method from complex background is presented in this paper. The idea is inspired by the recent development in searching for the sparse signal representation among a family of over-complete atoms, which is called a dictionary. We assume that the image under investigation is composed of two components: the foreground text and the complex background. We further assume that the latter can be modeled as a piece-wise smooth function. Then we choose two dictionaries, where the first one gives sparse representation to one component and non-sparse representation to another while the second one does the opposite. By looking for the sparse representations in each dictionary, we can decompose the image into the two composing components. After that, text segmentation can be easily achieved by applying simple thresholding to the text component. Preliminary experiments show some promising results.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126299313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Table Recognition and Understanding from PDF Files","authors":"Tamir Hassan, Robert Baumgartner","doi":"10.1109/ICDAR.2007.241","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.241","url":null,"abstract":"We propose a flexible method for detecting and understanding tables in PDF files, which is not reliant upon one particular feature being present, for example ruling lines or indentations, and is therefore applicable to a wide variety of visual presentations. We describe the steps required in transforming the low-level PDF instructions into text segments, lines and boxes on a page. We propose three different classifications for published tables, and develop methods to detect these tables and correctly identify their respective rows and columns. We also explain how to recognize spanning rows and columns, and multi-line rows. Experimental results show that our algorithm is effective in converting a wide variety of tabular presentations into HTML for information extraction purposes.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115813773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Document Logo Detection","authors":"Guangyu Zhu, D. Doermann","doi":"10.1109/ICDAR.2007.68","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.68","url":null,"abstract":"Automatic logo detection and recognition continues to be of great interest to the document retrieval community as it enables effective identification of the source of a document. In this paper, we propose a new approach to logo detection and extraction in document images that robustly classifies and precisely localizes logos using a boosting strategy across multiple image scales. At a coarse scale, a trained Fisher classifier performs initial classification using features from document context and connected components. Each logo candidate region is further classified at successively finer scales by a cascade of simple classifiers, which allows false alarms to be discarded and the detected region to be refined. Our approach is segmentation free and lay-out independent. We define a meaningful evaluation metric to measure the quality of logo detection using labeled groundtruth. We demonstrate the effectiveness of our approach using a large collection of real-world documents.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133832670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Non-Black Inks Using HSV Colour Space","authors":"Haritha Dasari, C. Bhagvati","doi":"10.1109/ICDAR.2007.138","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.138","url":null,"abstract":"An important problem in questioned document examination is detection of alterations done by inserting words or additional lines of text. In this paper, we present a statistical pattern recognition driven approach that views it as a two-class problem. Given two sample words, one of which is a suspected alteration, it is necessary to determine if the two belong to the same class or different classes. Our approach is defined in two stages. We start with an 11-dimensional vector that comprises colour features defined in HSV space and texture features. During the training phase, we derive within-class and between-class L1 distance distributions and identify an optimal threshold that minimizes Type I and Type II errors. During the second or test phase, we take a pair of unknown samples and use the threshold value obtained from the training phase to decide if the two belong to the same class or distinct classes. Our experimental results involving more than 95000 pairs of word images show that the approach gives an accuracy of over 90% for gel and roller pens and an accuracy of 85% for ball pen writings.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133937777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Off-line Signature Verification Using Enhanced Modified Direction Features in Conjunction with Neural Classifiers and Support Vector Machines","authors":"Vu Nguyen, M. Blumenstein, V. Muthukkumarasamy, G. Leedham","doi":"10.1109/ICDAR.2007.192","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.192","url":null,"abstract":"As a biometric, signatures have been widely used to identify people. In the context of static image processing, the lack of dynamic information such as velocity, pressure and the direction and sequence of strokes has made the realization of accurate off-line signature verification systems more challenging as compared to their on-line counterparts. In this paper, we propose an effective method to perform off-line signature verification based on intelligent techniques. Structural features are extracted from the signature's contour using the modified direction feature (MDF) and its extended version: the Enhanced MDF (EMDF). Two neural network-based techniques and Support Vector Machines (SVMs) were investigated and compared for the process of signature verification. The classifiers were trained using genuine specimens and other randomly selected signatures taken from a publicly available database of 3840 genuine signatures from 160 volunteers and 4800 targeted forged signatures. A distinguishing error rate (DER) of 17.78% was obtained with the SVM whilst keeping the false acceptance rate for random forgeries (FARR) below 0.16%.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134322322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to Reduce the Size of Bank Check Image Archive?","authors":"V. Shapiro","doi":"10.1109/ICDAR.2007.134","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.134","url":null,"abstract":"TIFF group 4 is the dominant format for storing bitonal bank check images in digital archives. Kept for at least seven years, these images demand a huge volume of disk space. The image size becomes a critical resource from the bandwidth perspective in the era of widespread Internet banking and image exchange between various financial institutions. This paper proposes an approach for lossy compression of TIFF group 4 images able to reduce the compressed size by an extra 30%. The approach consists of two separate steps: generic preprocessing, applicable to any document, and a check context-sensitive step that is specific to check image compression. Compared to the widely used resolution reduction method, the proposed approach preserves the essential information in the image, providing comparable compression rates.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132175157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language Models for Handwritten Short Message Services","authors":"Emmanuel Prochasson, C. Viard-Gaudin, E. Morin","doi":"10.1109/ICDAR.2007.153","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.153","url":null,"abstract":"Handwriting is an alternative method for entering texts composing short message services. However, a whole new language features the texts which are produced. They include for instance abbreviations and other consonantal writing which sprung up for time saving and fashion. We have collected and processed a significant number of such handwriting SMS, and used various strategies to tackle this challenging area of handwriting recognition. We proposed to study more specifically three different phenomena: consonant skeleton, rebus, and phonetic writing. For each of them, we compare the rough results produced by a standard recognition system with those obtained when using a specific language model.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132695540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}