Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Papers

How conditional independence assumption affects handwritten character segmentation
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953792
M. Maragoudakis, E. Kavallieratou, N. Fakotakis, G. Kokkinakis
Abstract: This paper uses Bayesian belief networks to improve the accuracy and training time of character segmentation for unconstrained handwritten text. The approach is compared experimentally against Naive Bayes classification, which assumes that the parameters are independent, and against two other commonly used earlier methods. The results indicate that capturing the inferential dependencies in the training data could reduce the required training time and training-set size by 55%. Moreover, the accuracy in detecting segment boundaries exceeds 86%, and even limited training data give very satisfactory results.
Citations: 3
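The conditional independence assumption named in the title above can be illustrated with a minimal Naive Bayes sketch. This is not the authors' system: the two binary features, the class labels `boundary` and `inside`, and the toy training data are all hypothetical, chosen only to show how the class score factorizes into one term per feature. A Bayesian belief network would instead allow features to depend on one another.

```python
from collections import defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature values (binary features)."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(int)  # key: (label, feature index, value)
    for features, label in zip(samples, labels):
        class_counts[label] += 1
        for i, value in enumerate(features):
            feature_counts[(label, i, value)] += 1
    return class_counts, feature_counts

def posterior(features, class_counts, feature_counts):
    """Return P(class | features), assuming features are independent given the class."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            # Add-one smoothing over the two possible values of a binary feature.
            score *= (feature_counts[(label, i, value)] + 1) / (count + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# Hypothetical toy data: each sample is (feature_0, feature_1) for a candidate cut point.
samples = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
labels = ["boundary", "boundary", "boundary", "inside", "inside", "inside"]
class_counts, feature_counts = train_naive_bayes(samples, labels)
result = posterior((1, 1), class_counts, feature_counts)
print(result)
```

Because each feature contributes an independent factor, any correlation between features is ignored; that is the simplification the paper evaluates.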
Applying the T-Recs table recognition system to the business letter domain
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953843
T. Kieninger, A. Dengel
Abstract: This paper summarizes the core idea of the T-Recs table recognition system, an integrated system covering block segmentation, table location and a model-free structural analysis of tables. T-Recs works on the output of commercial OCR systems that provide the word bounding box geometry together with the text itself (e.g. Xerox ScanWorX). While T-Recs performs well on a number of document categories, business letters have remained a challenging domain because the T-Recs location heuristics are misled by their headers or footers, resulting in low recognition precision. Business letters such as invoices are a very interesting domain for industrial applications because of the large number of documents to be analyzed and the importance of the data carried within their tables. Hence, we developed a more restrictive approach, which is implemented in the T-Recs++ prototype. This paper describes the ideas behind the T-Recs++ table location and also proposes a quality evaluation measure that reflects the bottom-up strategy of both T-Recs and T-Recs++. Finally, some results comparing both systems on a collection of business letters are given.
Citations: 80
Substroke approach to HMM-based on-line Kanji handwriting recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953838
M. Nakai, N. Akira, H. Shimodaira, S. Sagayama
Abstract: A new method is proposed for online handwriting recognition of Kanji characters. The method employs substroke HMMs as the minimum units that constitute Japanese Kanji characters and utilizes the direction of pen motion. The main motivation is to fully utilize the continuous speech recognition algorithm by relating sentence speech to Kanji characters, phonemes to substrokes, and grammar to Kanji structure. The proposed system consists of input feature analysis, substroke HMMs, a character structure dictionary and a decoder. The approach has the following advantages over conventional methods that employ whole-character HMMs: (1) much smaller memory requirements for the dictionary and models; (2) fast recognition through efficient substroke network search; (3) the ability to recognize characters not included in the training data, provided they are defined as a sequence of substrokes in the dictionary; (4) the ability to recognize characters written in various stroke orders, by allowing multiple definitions per character in the dictionary; (5) easy adaptation of the HMMs to the user from a few sample characters.
Citations: 105
Measuring HMM similarity with the Bayes probability of error and its application to online handwriting recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953822
Claus Bahlmann, H. Burkhardt
Abstract: We propose a novel similarity measure for hidden Markov models (HMMs). This measure calculates the Bayes probability of error for HMM state correspondences and propagates it along the Viterbi path in a similar way to HMM Viterbi scoring. It can be applied as a tool to interpret misclassifications, as a stop criterion in iterative HMM training or as a distance measure for HMM clustering. The similarity measure is evaluated in the context of online handwriting recognition on lower-case character models trained on the UNIPEN database. We compare the similarities with experimental classifications. The results show that similar and misclassified class pairs are highly correlated. The measure is not limited to handwriting recognition, but can be used in other applications that use HMM-based methods.
Citations: 69
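The Bayes probability of error mentioned in the abstract above can be sketched for the simplest case: two classes with discrete observation distributions. This is only an illustration of the standard two-class quantity, the sum over observations of the smaller prior-weighted likelihood. How the paper pairs HMM states and propagates the value along the Viterbi path is not reproduced here, and the function name `bayes_error` and the example distributions are assumptions.

```python
def bayes_error(p, q, prior_p=0.5):
    """Bayes probability of error between two discrete distributions.

    p and q list P(x | class 1) and P(x | class 2) over the same symbols.
    The optimal classifier picks the class with the larger prior-weighted
    likelihood, so the error mass at each symbol is the smaller of the two.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same symbols")
    prior_q = 1.0 - prior_p
    return sum(min(prior_p * a, prior_q * b) for a, b in zip(p, q))

# Identical distributions cannot be told apart (error 0.5 with equal priors);
# non-overlapping distributions are perfectly separable (error 0).
print(bayes_error([0.5, 0.5], [0.5, 0.5]))
print(bayes_error([1.0, 0.0], [0.0, 1.0]))
print(bayes_error([0.7, 0.3], [0.4, 0.6]))
```

With equal priors the value ranges from 0 (fully distinguishable) to 0.5 (indistinguishable), which is why a larger error can be read as greater similarity between the two models.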
Character pre-classification based on fuzzy typographical analysis
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953758
Lu Da, Pu Wei, B. McCane
Abstract: This paper presents a new fuzzy-logic approach to character pre-classification. It provides a precise baseline detection algorithm with tolerance analysis, based on analyzing the typographical structure of text blocks. The other virtual reference lines are extracted using clustering techniques. To ensure correct character pre-classification, a fuzzy-logic approach is used to assign a membership value for each typographical category to characters in ambiguous classes. The results show that typographical categorization can improve the character recognition rate. The fuzzy typographical analysis can correctly pre-classify characters and can efficiently process more than 10000 characters per second.
Citations: 1
An improved learning scheme for the moving window classifier
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953861
Sanaul Hoque, M. Fairhurst
Abstract: The moving window classifier (MWC) is a simple and efficient classifier structure. Although it has been shown to be capable of promising performance in a variety of tasks such as face recognition, it is most commonly applied as a tool in text recognition. Various measures have been proposed to improve the MWC classification speed and to reduce its memory requirement. This paper introduces techniques for improving the MWC classification accuracy without losing any of the gains previously achieved. These performance enhancement schemes are readily applicable to a range of related classifiers and hence provide a generalized method for enhancement in a variety of tasks.
Citations: 3
AIDAS: incremental logical structure discovery in PDF documents
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953816
A. Anjewierden
Abstract: AIDAS is part of a research project that aims to turn technical manuals into a database of indexed training material. We describe the approach AIDAS uses to extract the logical document structure from PDF documents. The approach is based on the idea that the layout structure contains cues about the logical structure and that the logical structure can be discovered incrementally.
Citations: 55
Handwritten country name identification using vector quantisation and hidden Markov model
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953877
G. Leedham, W. Tan, Weng Lee Yap
Abstract: This paper is a study of keyword recognition using vector quantisation (VQ) and a hidden Markov model (HMM). The purpose is to identify a word holistically. The study considers the problem of identifying a handwritten country name from the 189 different country names registered with the Universal Postal Union. The method divides the words in the last line of the address image into 16×16 pixel blocks, which are fed into a vector quantiser. The VQ outputs are classified using an HMM. Some presorting is carried out based on the letter-length of the word. Results on a set of 415 handwritten country names show that the method is 85.3% correct, with the majority of errors arising from estimating the letter-length of the word and from distorted VQ output caused by sloping and slanted words or letters.
Citations: 3
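The block-and-quantise front end described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is a plain list of pixel rows, blocks are read left to right and top to bottom, partial blocks at the edges are dropped, the codebook is given rather than trained, and nearest-codeword matching uses squared Euclidean distance. The function names are hypothetical, and the HMM stage that classifies the resulting symbol sequence is omitted.

```python
def split_into_blocks(image, size=16):
    """Cut an image (list of pixel rows) into flattened size x size blocks.

    Blocks are taken left to right, top to bottom; partial blocks at the
    right and bottom edges are dropped (an assumption of this sketch).
    """
    height = len(image)
    width = len(image[0]) if image else 0
    blocks = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            blocks.append([image[r][c]
                           for r in range(top, top + size)
                           for c in range(left, left + size)])
    return blocks

def quantise(block, codebook):
    """Return the index of the codeword closest to the block."""
    def distance(k):
        return sum((a - b) ** 2 for a, b in zip(block, codebook[k]))
    return min(range(len(codebook)), key=distance)

# Toy 16 x 32 image: left half background (0), right half ink (1).
image = [[0] * 16 + [1] * 16 for _ in range(16)]
codebook = [[0] * 256, [1] * 256]  # hypothetical two-entry codebook
symbols = [quantise(b, codebook) for b in split_into_blocks(image)]
print(symbols)
```

The resulting list of codeword indices is the kind of discrete observation sequence that a discrete-output HMM can then score, one model per candidate country name.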
Web sites thematic classification using hidden Markov models
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953955
Lyonel Serradura, M. Slimane, N. Vincent
Abstract: More and more information is available on the Internet, and we need tools to help us extract the right piece of information. We have developed a classification algorithm that tackles this issue for French. It distinguishes web pages by classifying their text content into themes. We use hidden Markov models (HMMs) to build this method, named STCoL (Supervised Thematic Corpus Learning). Once themes are modeled with HMMs, STCoL is able to classify documents from different sources. The method is not only efficient but also robust.
Citations: 1
Newspaper page decomposition using a split and merge approach
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date: 2001-09-10 DOI: 10.1109/ICDAR.2001.953972
K. Hadjar, O. Hitz, R. Ingold
Abstract: Indexing large newspaper archives requires automatic page decomposition algorithms with high accuracy. In this paper, we present our approach to an automatic page decomposition algorithm developed for the First International Newspaper Segmentation Contest. Our approach decomposes the newspaper image into image regions, horizontal and vertical lines, text regions and title areas. Experimental results are obtained from the data set of the contest.
Citations: 41