Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Articles

Training with positive and negative data samples: effects on a classifier for hand-drawn geometric shapes
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953939
Hanaa Barakat, D. Blostein
Abstract: It is quite common in document analysis and symbol recognition to rely on a priori knowledge about the nature of the document in order to locate candidate symbols. It is desirable, but less common, for a segmentation procedure to rely on "a posteriori" feedback from a non-human-guided process to adjust for segmentation errors. For this method to succeed, the feedback must come from a reliable classifier (one that is able to reject negative symbols, including mis-segmented symbols). This paper examines the use of positive and negative training data on a nearest-neighbour classifier for hand-drawn geometric shapes. We explore the issues involved in the development of a reliable classifier using this method, and we discuss the trade-off between reliability and correctness.
Citations: 1
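The idea of a nearest-neighbour classifier that can reject input may be easier to picture with a small sketch. The following is only an illustration under stated assumptions, not the authors' implementation: it assumes feature vectors are plain tuples of floats, that Euclidean distance is used, and that negative training samples carry a hypothetical label `"reject"`. The names `nearest_neighbour` and `classify_with_rejection` are invented for this example.

```python
import math

def nearest_neighbour(sample, training):
    """Return (label, distance) of the training vector closest to `sample`.

    `training` is a list of (feature_tuple, label) pairs.
    """
    best_label, best_dist = None, math.inf
    for features, label in training:
        d = math.dist(sample, features)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist

def classify_with_rejection(sample, training, reject_label="reject"):
    """Return the nearest positive label, or None if the nearest
    neighbour is a negative (assumed "reject"-labelled) sample."""
    label, _ = nearest_neighbour(sample, training)
    return None if label == reject_label else label

# Hypothetical toy data: two positive shape classes and one negative sample.
training = [
    ((0.0, 0.0), "circle"),
    ((10.0, 10.0), "square"),
    ((5.0, 5.0), "reject"),
]
print(classify_with_rejection((1.0, 1.0), training))  # circle
print(classify_with_rejection((5.0, 4.0), training))  # None
```

In this sketch, adding more negative samples makes the classifier reject more often, which is one way to picture the trade-off between reliability and correctness that the abstract mentions.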
Binarising camera images for OCR
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953754
M. Seeger, C. Dance
Abstract: We describe a binarisation method designed specifically for OCR of low quality camera images: background surface thresholding or BST. This method is robust to lighting variations and produces images with very little noise and consistent stroke width. BST computes a "surface" of background intensities at every point in the image and performs adaptive thresholding based on this result. The surface is estimated by identifying regions of low-resolution text and interpolating neighbouring background intensities into these regions. The final threshold is a combination of this surface and a global offset. According to our evaluation BST produces considerably fewer OCR errors than Niblack's local average method while also being more runtime efficient.
Citations: 75
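For context, the baseline named in the abstract, Niblack's method, sets a per-pixel threshold from the local mean and standard deviation, T = m + k·s. The sketch below illustrates that baseline only, not BST itself. The function name `niblack_binarise`, the window size, and the value k = -0.2 (a commonly quoted choice for dark text on a light background) are assumptions made for this example.

```python
import math

def niblack_binarise(image, window=3, k=-0.2):
    """Binarise a grey-level image (list of rows) with Niblack's rule.

    A pixel becomes 1 (foreground) when it is darker than
    local_mean + k * local_std, computed over a window clipped at the borders.
    """
    h, w = len(image), len(image[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                image[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            mean = sum(vals) / len(vals)
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean + k * std
            out[y][x] = 1 if image[y][x] < threshold else 0
    return out

# Toy input: one dark pixel on a bright background.
image = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
print(niblack_binarise(image))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

This pure-Python version recomputes the window statistics at every pixel, so it is slow on real images; it is meant only to make the thresholding rule concrete.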
On-line recognition of UML diagrams
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953813
E. Lank, Jeb S. Thorley, Sean Chen, D. Blostein
Abstract: Unified Modeling Language (UML) diagrams are widely used by software engineers to describe the structure of software systems. Early in the software design cycle, software engineers informally sketch initial UML diagrams on paper or whiteboards. The information provided by these UML diagrams needs to be made available to computer assisted software engineering (CASE) tools. In order to smooth this transition from paper to electronic form, we have developed an online recognition system for UML diagrams. The system accepts input from an electronic whiteboard, a data tablet or a mouse. Efforts have been made to separate the domain-independent and domain-specific parts of the recognition system. The kernel of the system is retargetable, providing a general front end for online recognition of any glyph-based diagram notation. The kernel is extended with UML-specific routines for segmentation, recognition of glyphs, and recognition of glyph relationships.
Citations: 38
Online recognition of sketched electrical diagrams
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953832
Jean-Philippe Valois, Myriam Côté, M. Cheriet
Abstract: In this paper, a model-based scheme for recognizing and beautifying online hand-drawn sketches of electric diagrams is presented. The system uses a structural and topological relations matching mechanism that allows scale-, translation-, and rotation-invariant recognition. A simple prototype was developed and preliminary experimental results show how this technique, although simple, is efficient in recognizing such sketches.
Citations: 26
A distributed scheme for lexicon-driven handwritten word recognition and its application to large vocabulary problems
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953872
Alessandro Lameiras Koerich, R. Sabourin, C. Suen
Abstract: Many offline handwritten word recognition systems have been proposed since the early nineties. Most systems reported high recognition rates; however, they overlooked a very important factor in the process: speed. The authors explore the potential for speeding up an offline handwritten word recognition system via concurrency. The goal of the system is to achieve both full accuracy and high speed when taking into account large vocabularies. This was accomplished by integrating the recognition process with multiprocessing and distributed computing concepts. Experimental results showed that the multiprocessing environment is very promising in enhancing the performance of a sequential offline handwritten word recognition system.
Citations: 4
Advanced character recognition 6610
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953744
G. Nagy
Abstract: ECSE 6610 Advanced Character Recognition. Principles and practice of the recognition of isolated or connected typeset, hand-printed, and cursive characters. Review of optical digitization, supervised and unsupervised estimation of classifier parameters, bias and variance, expectation maximization, the curse of dimensionality. Advanced classification techniques including classifier combinations, support vector machines, hidden Markov methods, styles, language context, adaptation, segmentation-free classifiers, indirect symbolic correlation. Prereq: ECSE 2610, Probability, Linear Algebra. Spring term annually.
Citations: 2
An a priori indicator of the discrimination power of discrete hidden Markov models
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953812
Frédéric Grandidier, R. Sabourin, M. Gilloux, C. Suen
Abstract: During the development of a hidden Markov model based handwriting recognition system, the testing phase takes a non-negligible amount of computation time. This is especially true for real applications where the lexicon size is large. In order to shorten the development process, we propose an indicator of the system discrimination power. This indicator is calculated during training and its final value is obtained at the end of the training phase, without further calculation. Its definition consists of a modification of the observation probability of the validation corpus by the trained system. Some experiments were carried out and the results clearly show the correlation between this indicator and recognition rates.
Citations: 1
Applying fast segmentation techniques at a binary image represented by a set of non-overlapping blocks
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953965
B. Gatos, N. Papamarkos
Abstract: Run length smoothing algorithm (RLSA) and projection profiles are among the fundamental algorithms in binary image processing, mainly used for segmentation of monochrome images. In this paper, fast RLSA and projection profiles are applied to binary images represented by a set of nonoverlapping rectangular blocks. The representation of binary images using rectangular blocks as primitives has been used with great success for several image processing tasks, such as image compression, Hough transform fast implementation and skeletonization. We show that this representation can be applied with great success for fast RLSA application and fast projection profiles evaluation. The experimental results demonstrate that starting from a block-represented binary image we can apply RLSA and evaluate projection profiles in significantly less CPU time. The average time gain is recorded at 60% and 88%, respectively.
Citations: 4
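As background for the two algorithms named in the abstract above, the following sketch shows the classical pixel-based forms of horizontal RLSA and of projection profiles. It does not reproduce the block-based speed-up the paper describes. It assumes a binary image stored as a list of rows with 1 for black and 0 for white, and it fills only white runs bounded by black pixels on both sides whose length is at most the threshold; conventions for border runs and for strict versus non-strict comparison vary between descriptions. All function names are hypothetical.

```python
def rlsa_row(row, threshold):
    """Fill white runs (0s) of length <= threshold lying between two black pixels (1s)."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None and 0 < i - last_black - 1 <= threshold:
                for j in range(last_black + 1, i):
                    out[j] = 1
            last_black = i
    return out

def rlsa_horizontal(image, threshold):
    """Apply run length smoothing independently to every row."""
    return [rlsa_row(row, threshold) for row in image]

def row_profile(image):
    """Number of black pixels in each row."""
    return [sum(row) for row in image]

def column_profile(image):
    """Number of black pixels in each column."""
    return [sum(col) for col in zip(*image)]

print(rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], 2))
# [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
image = [[1, 0, 1], [0, 0, 1]]
print(row_profile(image))     # [2, 1]
print(column_profile(image))  # [1, 0, 2]
```

Both routines visit every pixel, which is the cost a block representation aims to reduce by working on whole rectangles instead.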
A real-world evaluation of a generic document recognition method applied to a military form of the 19th century
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953894
Bertrand Coüasnon, L. Pasquer
Abstract: In this paper we present a real-world evaluation of DMOS, a new generic document recognition method. This method uses a new grammatical formalism (EPF) and an associated parser able to introduce context in segmentation. We have implemented this DMOS method to build an automatic generator of structured document recognition systems. We have already produced three recognition systems by only changing the EPF grammar: one on musical scores, one on mathematical formulae and one on recursive table structures. We present here a specific light grammar to automatically recognize quite damaged 19th century military forms. The quality of those forms is far from perfect: table lines are not well printed, paper is so thin that there are transparency problems (the forms are two-sided) but the biggest problem comes from small paper sheets hiding part of the structure. The system was evaluated on 5268 images, and the results show that it did not make any mistakes. Moreover, it can recognize the entire structure in 97.2% of the forms (the other 2.8% are automatically set apart).
Citations: 17
Synthetic data for Arabic OCR system development
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953967
V. Märgner, M. Pechwitz
Abstract: A system for the automatic generation of synthetic databases for the development or evaluation of Arabic word or text recognition systems (Arabic OCR) is presented. The proposed system works without any scanning of printed paper. Firstly, Arabic text has to be typeset using a standard typesetting system. Secondly, a noise-free bitmap of the document and the corresponding ground truth (GT) are automatically generated. Finally, an image distortion can be superimposed on the character or word image to simulate the expected real world noise of the intended application. All necessary modules are presented together with some examples. Special problems caused by specific features of Arabic, such as printing from right to left, many diacritical points, variation in the height of characters, and changes in the relative position to the writing line, are suggested. The synthetic data set was used to train and test a recognition system based on a hidden Markov model (HMM), which was originally developed for German cursive script, for Arabic printed words. Recognition results with different synthetic data sets are presented.
Citations: 43