{"title":"Duplicate detection for symbolically compressed documents","authors":"Dar-Shyang Lee, J. Hull","doi":"10.1109/ICDAR.1999.791785","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791785","url":null,"abstract":"A new family of symbolic compression algorithms has recently been developed that includes the ongoing JBIG2 standardization effort as well as related commercial products. These techniques are specifically designed for binary document images. They cluster individual blobs in a document and store the sequence of occurrence of blobs and representative blob templates, hence the name symbolic compression. This paper describes a method for duplicate detection on symbolically compressed document images. It recognizes the text in an image by deciphering the sequence of occurrence of blobs in the compressed representation. We propose a Hidden Markov Model (HMM) method for solving such deciphering problems and suggest applications in multilingual document duplicate detection.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126322049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A two-step algorithm and its parallelization for the generation of minimum containing rectangles for document image segmentation","authors":"S. Sural, P. Das","doi":"10.1109/ICDAR.1999.791752","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791752","url":null,"abstract":"In document processing, segmentation is done to uniquely identify each foreground connected region of an image by specifying its minimum containing rectangle (MCR). MCR is the rectangle with minimum dimensions that completely encloses a geometric pattern. We present a two-step MCR detection algorithm and its parallelization method. The first step determines the boundary of each connected component in a document image. This reduces resource requirements and speeds up the subsequent rectangle detection step. The rectangle detection step determines MCRs of the connected components from the detected boundaries. A comparison is made between a single-step and the two-step approaches of MCR detection. Both the boundary detection and the rectangle detection steps are parallelized and implemented on transputers to reduce the total processing time.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"19 16","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113935492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logical structure analysis of document images based on emergent computation","authors":"Yasuto Ishitani","doi":"10.1109/ICDAR.1999.791756","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791756","url":null,"abstract":"A new method for logical structure analysis of document images is proposed in this paper as the basis for a document reader which can extract logical information from various printed documents. The proposed system consists of five basic modules: typography analysis, object recognition, object segmentation, object grouping and object modification. Emergent computation, which is a key concept of artificial life, is adopted for the cooperative interaction among the modules in the system in order to achieve an effective and flexible behavior of the whole system. It has two principal advantages over other methods: adaptive system configuration for various and complex logical structures, and robust document analysis that is tolerant of erroneous feature detection.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114660660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relaxation-based pattern matching using automatic differentiation for off-line character recognition","authors":"T. Nagasaki, T. Yanagida, M. Nakagawa","doi":"10.1109/ICDAR.1999.791766","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791766","url":null,"abstract":"The paper describes a relaxation based matching method for offline character recognition. This method employs elastic stroke models as standard character patterns. Pattern similarity between a standard and an input pattern is defined by fuzzy logic. The matching process is formalized as a maximization problem of the similarity and computed by the steepest descent technique. To implement this technique, we adopted automatic differentiation, which made it possible to calculate the partial derivatives of the target function automatically, only given the definition of that function. Results of computer experiments targeting 46 hiragana characters from the ETL8B database revealed a maximum recognition rate of 98.8% for 20 input sets when combining stroke springs with relative location springs.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115046577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new distinguishing algorithm of connected character image based on Fourier transform","authors":"Xiaoyan Zhu, Yifan Shi, Song Wang","doi":"10.1109/ICDAR.1999.791906","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791906","url":null,"abstract":"Segmentation is the most difficult problem in a handwritten character recognition system and often contributes major errors to its performance. To reach a balance of speed and accuracy, a filter distinguishing a connected image from an isolated image is required for multi-stage segmentation. The Fourier spectrum is promising in this problem. Since it is influenced by the stroke width, we propose a Fourier spectrum standardization method. Based on the standardized Fourier spectrum, a set of features and a fine-tuned criterion are presented to classify connected/isolated images. A theoretical analysis proves their rationality. Experimental results demonstrate that this criterion is better than other methods.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115596843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparison of research and production architectures for check reading systems","authors":"Stewart Kelland, Slawomir Wesolkowski","doi":"10.1109/ICDAR.1999.791734","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791734","url":null,"abstract":"We have developed research and production architectures for the automatic processing of checks. The research architecture is flexible and allows rapid prototyping of image preprocessing, document analysis, recognition, and combination algorithms. The production architecture features high throughput, scalability, robustness, high availability, and low cost. Each architecture has features that make it attractive for its particular use. Results obtained using the production architecture in a live environment are shown. The benefits and drawbacks of both architectures are discussed.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"475 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115634046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MergeLayouts-overcoming faulty segmentations by a comprehensive voting of commercial OCR devices","authors":"Stefan Klink, T. Jäger","doi":"10.1109/ICDAR.1999.791805","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791805","url":null,"abstract":"In this paper we present a comprehensive voting approach, taking entire layouts obtained from commercial OCR devices as input. Such a layout comprises segments of three kinds: lines, words, and characters. By combining all attributes of a segment (e.g. recognized text, font height etc.), we attain a \"better\" layout, representing the original page layout as well as possible. The voting process itself is hierarchically organized, starting with the line segments. For each level, a search tree is spawned and all fellow segments (segments from different layouts which denote the same image area) are established. A heuristic search method is utilized which is guided by a similarity measure defined on segments. Deviations in the segmentation, as well as segmentation errors of individual commercial OCR devices, are compensated by an \"equalization module\".","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122440578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preattentive reading and selective attention for document image analysis","authors":"C. Faure","doi":"10.1109/ICDAR.1999.791853","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791853","url":null,"abstract":"PixED (from Pixel to Electronic Document) is aimed at converting document images into structured electronic documents which can be read by a machine for information retrieval. The approach is based on the combination of perception and symbol reading which are the two processes involved when humans detect the organisation of a document. \"Preattentive reading\" denotes the physical segmentation related to perceptual organisation. \"Selective attention\" means that symbol reading is limited to specific sequences of symbols or to pre-attentively selected locations. An OCR provides the primary structured description of the document. PixED improves the quality of this description, completes the physical segmentation and adds a logical description. A distributed software architecture and an incremental strategy are defined to enable the integration of perception and symbol reading. The approach is tested on a set of documents composed of several pages which are gathered from proceedings of scientific conferences.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129871849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interactive approach to the extraction of logical structures from unformatted document images using a sub-structure model","authors":"M. Yamaoka, O. Iwaki, N. Babaguchi, T. Kitahashi","doi":"10.1109/ICDAR.1999.791755","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791755","url":null,"abstract":"Describes a new document analysis method for unformatted documents such as advertisements or catalogs. Conventional model-based approaches to the extraction of logical structures are hard to apply to advertisements or catalogs, because a model of a page can't be defined. However, these kinds of documents have similar configurations of the regions that represent each product, where a local model of a local layout and logical structures can be defined. This model, which we call a sub-structure model, can be used as a template to extract the logical structures from other regions that represent the same kinds of products. In the proposed system, a sub-structure model is captured through an interactive process with a user. The system was tested on advertisements in Japanese computer magazines and the experiments show promising results.","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125563644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-line handwritten formula recognition using hidden Markov models and context dependent graph grammars","authors":"A. Kosmala, G. Rigoll, S. Lavirotte, L. Pottier","doi":"10.1109/ICDAR.1999.791736","DOIUrl":"https://doi.org/10.1109/ICDAR.1999.791736","url":null,"abstract":"This paper presents an approach for the recognition of on-line handwritten mathematical expressions. The hidden Markov model (HMM) based system makes use of simultaneous segmentation and recognition capabilities, avoiding a crucial segmentation during pre-processing. With the segmentation and recognition results obtained from the HMM recognizer, it is possible to analyze and interpret the spatial two-dimensional arrangement of the symbols. We use a graph grammar approach for the structure recognition, also used in the off-line recognition process, resulting in a general tree-structure of the underlying input-expression. The resulting constructed tree can be translated to any desired syntax (for example: Lisp, LaTeX, and OpenMath).","PeriodicalId":130039,"journal":{"name":"Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130013518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}