Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Articles

Handwritten numeral recognition using flexible matching based on learning of stroke statistics
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953862
Takashi Kobayashi, Kaori Nakamura, Hirokazu Muramatsu, Takahiro Sugiyama, K. Abe
Abstract: The purpose of this study is to learn the shapes and structures of a given learning set of handwritten numerals and to develop a flexible matching method for recognition based on that learning. First, the paper proposes a method for obtaining a set of standard character patterns, together with their statistical ranges of variation, from the given learning samples. Recognition then proceeds as follows: each standard pattern is deformed to match the input character, and the match is evaluated by the energy of deformation and the closeness of the standard pattern to the input.
Citations: 2
Adaptive N-best-list handwritten word recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953777
T. Kwok, M. Perrone
Abstract: We investigate a novel method for adaptively improving the machine recognition of handwritten words by applying a k-nearest neighbor (k-NN) classifier to the N-best word-hypothesis lists generated by a writer-independent hidden Markov model (HMM). Each new N-best list from the HMM is compared to the N-best lists in the k-NN classifier. A decision module is used to select between the output of the HMM and the matches found by the k-NN classifier. The N-best list chosen by the decision module can be automatically added to the k-NN classifier if it is not already in the k-NN classifier. This dynamic update of the k-NN classifier enables the system to adapt to new data without retraining. On a writer-independent set of 1158 handwritten words, this method reduces the error rate by approximately 30%. This method is fast and memory-efficient, and lends itself to many interesting generalizations.
Citations: 7
Text extraction from color documents-clustering approaches in three and four dimensions
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953923
T. Perroud, K. Sobottka, H. Bunke, L. Hall
Abstract: Colored paper documents often contain important text information. For automating the retrieval process, identification of text elements is essential. In order to reduce the number of colors in a scanned document, color clustering is usually done first. In this article two histogram-based color clustering algorithms are investigated. The first is based on the RGB color space exclusively, while the second takes spatial information into account, in addition to the colors. Experimental results have shown that the use of spatial information in the clustering algorithm has a positive impact. Thus the automatic retrieval of text information can be improved. The proposed methods for clustering are not restricted to document images. They can also be used for processing Web or video images, for example.
Citations: 30
Robust feature extraction based on run-length compensation for degraded handwritten character recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953870
M. Mori, M. Sawaki, N. Hagita, H. Murase, N. Mukawa
Abstract: Conventional features are robust for recognizing either deformed or degraded characters. This paper proposes a feature extraction method that is robust for both of them. Run-length compensation is introduced for extracting approximate directional run-lengths of strokes from degraded handwritten characters. This technique is applied to the conventional feature vector based on directional run-lengths. Experiments for handwritten characters with additive or subtractive noise show that the proposed feature is superior to conventional ones over a wide range of the degree of noise.
Citations: 9
Text line segmentation and word recognition in a system for general writer independent handwriting recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953775
Urs-Viktor Marti, H. Bunke
Abstract: We present a system for recognizing unconstrained English handwritten text based on a large vocabulary. We describe the three main components of the system, which are preprocessing, feature extraction and recognition. In the preprocessing phase the handwritten texts are first segmented into lines. Then each line of text is normalized with respect to skew, slant, vertical position and width. After these steps, text lines are segmented into single words. For this purpose distances between connected components are measured. Using a threshold, the distances are divided into distances within a word and distances between different words. A line of text is segmented at positions where the distances are larger than the chosen threshold. From each image representing a single word, a sequence of features is extracted. These features are input to a recognition procedure which is based on hidden Markov models. To investigate the stability of the segmentation algorithm, the threshold that separates intra- and inter-word distances from each other is varied. If the threshold is small, many errors are caused by over-segmentation, while for large thresholds under-segmentation errors occur. The best segmentation performance is 95.56% correctly segmented words, tested on 541 text lines containing 3899 words. Given a correct segmentation rate of 95.56%, a recognition rate of 73.45% on the word level is achieved.
Citations: 95
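The threshold-based word segmentation step described in this abstract can be illustrated with a minimal Python sketch. It assumes each connected component is reduced to its horizontal extent `(left, right)` in pixels; the function name `segment_line` and the sample coordinates are hypothetical and are not taken from the paper, which also measures distances on normalized line images that this sketch does not model.

```python
from typing import List, Tuple

# Assumption: a connected component is represented only by its x-extent.
Box = Tuple[int, int]  # (left, right)


def segment_line(components: List[Box], threshold: int) -> List[List[Box]]:
    """Group components into words, splitting where the gap exceeds threshold."""
    words: List[List[Box]] = []
    right_edge = None  # rightmost pixel seen so far in the current word
    for box in sorted(components):
        left, right = box
        if right_edge is None or left - right_edge > threshold:
            words.append([box])       # gap is an inter-word distance
            right_edge = right
        else:
            words[-1].append(box)     # gap is an intra-word distance
            right_edge = max(right_edge, right)
    return words


# Hypothetical line with gaps of 2, 15 and 2 pixels between components.
line = [(0, 10), (12, 20), (35, 45), (47, 60)]
words_ok = segment_line(line, threshold=5)      # splits only at the 15 px gap
words_over = segment_line(line, threshold=1)    # every gap becomes a word break
words_under = segment_line(line, threshold=20)  # no gap is large enough
```

With these made-up coordinates, a threshold of 5 yields two words, a threshold of 1 over-segments into four, and a threshold of 20 under-segments into one, which mirrors the trade-off the abstract reports when the threshold is varied.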
An OCR system for Telugu
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953958
A. Negi, C. Bhagvati, B. Krishna
Abstract: Telugu is the language spoken by more than 100 million people of South India. Telugu has a complex orthography with a large number of distinct character shapes (estimated to be of the order of 10,000) composed of simple and compound characters formed from 16 vowels (called achchus) and 36 consonants (called hallus). We present an efficient and practical approach to Telugu OCR which limits the number of templates to be recognized to just 370, avoiding issues of classifier design for thousands of shapes or very complex glyph segmentation. A compositional approach using connected components and fringe distance template matching was tested to give a raw OCR accuracy of about 92%. Several experiments across varying fonts and resolutions showed the approach to be satisfactory.
Citations: 143
Discrimination of Oriental and Euramerican scripts using fractal feature
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953959
Yu Tao, Y. Tang
Abstract: This paper presents a new approach based on modified fractal signatures (MFS) and modified fractal features (MFF) for the discrimination of Oriental and Euramerican scripts. These methods will be useful in the measurement and classification of patterns. MFS do not need iterative breaking or merging, and can divide a document into blocks in a single step. MFF is also used in the identification and classification of a selected set of texture images with good results. It is anticipated that this approach could be widely used to process various types of documents, even including some with high geometrical complexity.
Citations: 13
Creating generic text summaries
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953917
Yihong Gong, Xin Liu
Abstract: We propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The first method uses standard information retrieval methods to rank sentences by relevance, while the second method uses the latent semantic analysis technique to identify semantically important sentences for summary creation. Both methods strive to select sentences that are highly ranked and different from each other. This is an attempt to create a summary with a wider coverage of the document's main content and less redundancy. Performance evaluations on the two summarization methods are conducted by comparing their summarization outputs with the manual summaries generated by three independent human evaluators.
Citations: 22
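The idea of selecting sentences that are both highly ranked and different from each other can be sketched as follows. This is one plausible reading of a relevance-based extractive summarizer, not necessarily the authors' procedure: it assumes raw term-frequency vectors, cosine similarity against the whole document, and a greedy step that discards the terms of each chosen sentence so that later picks cover other content. The helper names (`tokens`, `cosine`, `summarize`) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase alphabetic tokens; no stemming or stop-word removal."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences: List[str], k: int) -> List[str]:
    """Greedily pick up to k sentences, each scored against the uncovered terms."""
    sent_vecs = [Counter(tokens(s)) for s in sentences]
    doc_vec: Counter = Counter()
    for vec in sent_vecs:
        doc_vec.update(vec)
    candidates = list(range(len(sentences)))
    chosen: List[int] = []
    while candidates and len(chosen) < k:
        best = max(candidates, key=lambda i: cosine(sent_vecs[i], doc_vec))
        if cosine(sent_vecs[best], doc_vec) == 0.0:
            break  # nothing left to cover
        chosen.append(best)
        candidates.remove(best)
        for term in sent_vecs[best]:
            doc_vec.pop(term, None)  # covered terms no longer attract picks
    return [sentences[i] for i in sorted(chosen)]  # keep document order


doc = ["Cats chase mice.", "Cats chase mice often.", "Dogs bark loudly."]
summary = summarize(doc, k=2)
```

In this toy input the two cat sentences are near-duplicates, so after the first one is chosen its terms are removed and the second pick falls on the sentence about dogs, which is the redundancy-avoiding behavior the abstract describes.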
A class-modularity for character recognition
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953756
Il-Seok Oh, Jin-Seon Lee, C. Suen
Abstract: A class-modular classifier can be characterized by two prominent features: low classifier complexity and independence of classes. While conventional character recognition systems adopting the class modularity are faithful to the first feature, they do not investigate the second one. Since a class can be handled independently of the other classes, the class-specific feature set and classifier architecture can be optimally designed for a specific class. Here we propose a general framework for the class modularity that fully exploits both features, and present four types of class-modular architecture. The neural network classifier is used for testing the framework. A simultaneous selection of the feature set and network architecture is performed by the genetic algorithm. The effectiveness of the class-specific features and classifier architectures is confirmed by experimental results on the recognition of handwritten numerals.
Citations: 2
Word discrimination based on bigram co-occurrences
Proceedings of Sixth International Conference on Document Analysis and Recognition Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953773
A. El-Nasan, S. Veeramachaneni, G. Nagy
Abstract: Very few pairs of English words share exactly the same letter bigrams. This linguistic property can be exploited to bring lexical context into the classification stage of a word recognition system. The lexical n-gram matches between every word in a lexicon and a subset of reference words can be precomputed. If a match function can detect matching segments of at least n-gram length from the feature representation of words, then an unknown word can be recognized by determining the subset of reference words having an n-gram match at the feature level with the unknown word. We show that with a reasonable number of reference words, bigrams represent the best compromise between the recall ability of single letters and the precision of trigrams. Our simulations indicate that using a longer reference list can compensate for errors in feature extraction. The algorithm is fast enough, even with a slow processor, for human-computer interaction.
Citations: 14
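The precomputed match pattern between lexicon words and reference words can be illustrated with a small Python sketch. It works on letters only, as a stand-in for the feature-level match function that the abstract assumes but does not specify; the function names and the tiny word lists are hypothetical examples rather than data from the paper.

```python
from typing import Dict, FrozenSet, List, Tuple


def letter_bigrams(word: str) -> FrozenSet[str]:
    """Set of adjacent letter pairs in a word, case-insensitive."""
    w = word.lower()
    return frozenset(w[i:i + 2] for i in range(len(w) - 1))


def match_signature(word: str, references: List[str]) -> Tuple[bool, ...]:
    """For each reference word, record whether it shares a bigram with word."""
    grams = letter_bigrams(word)
    return tuple(not grams.isdisjoint(letter_bigrams(r)) for r in references)


def build_index(lexicon: List[str],
                references: List[str]) -> Dict[Tuple[bool, ...], List[str]]:
    """Precompute signature -> lexicon words sharing that signature."""
    index: Dict[Tuple[bool, ...], List[str]] = {}
    for word in lexicon:
        index.setdefault(match_signature(word, references), []).append(word)
    return index


# Hypothetical reference list and lexicon, chosen only for illustration.
references = ["there", "which", "about", "would"]
lexicon = ["other", "child", "south", "world"]
index = build_index(lexicon, references)

# An unknown word whose matches against the references are (no, yes, no, yes)
# maps to exactly one lexicon entry in this toy setup.
candidates = index[(False, True, False, True)]  # → ["child"]
```

Here four reference words are enough to give each of the four lexicon words a distinct signature. In a real system the signature of the unknown word would come from matching image features rather than letters, and a longer reference list would add redundancy against matching errors.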