Ninth International Conference on Document Analysis and Recognition (ICDAR 2007): Latest Papers

Three decision levels strategy for Arabic and Latin texts differentiation in printed and handwritten natures
M. B. Jlaiel, S. Kanoun, A. Alimi, R. Mullot
{"title":"Three decision levels strategy for Arabic and Latin texts differentiation in printed and handwritten natures","authors":"M. B. Jlaiel, S. Kanoun, A. Alimi, R. Mullot","doi":"10.1109/ICDAR.2007.250","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.250","url":null,"abstract":"Arabic and Latin script identification in printed and handwritten nature present several difficulties because the Arabic (printed or handwritten) and the handwritten Latin scripts are cursive scripts of nature. To avoid all possible confusions which can be generated, we propose in this paper a strategy which is based on three decision levels where each level will have its own features vector and will consist in identifying only one script among the scripts to identify.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125380476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31
Middle Zone Component Extraction and Recognition of Telugu Document Image
L. Reddy, L. Satyaprasad, A. Sastry
{"title":"Middle Zone Component Extraction and Recognition of Telugu Document Image","authors":"L. Reddy, L. Satyaprasad, A. Sastry","doi":"10.1109/ICDAR.2007.169","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.169","url":null,"abstract":"Telugu is one of the ancient languages of South India. It has a complex orthography with a large number of distinct character shapes composed of simple and compound characters. The work reported in literature till the recent period is based on the connected component approach. Less attention is observed on the generalized character model and its application in the OCR development. Script syllable follows canonical structure where a consonant vowel core is preceded by one or two optional consonants. Formation of a syllable possesses unique structural nature. In the present work, structural features of the syllable and the component model are combined to extract middle zone components. The shape of the middle zone components is closely related to a circle whereas other components are found with different topological features. Recognition rate of 99 percent is observed with the proposed method.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126901299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Knowledge-Based Recognition of Utility Map Sub-Diagrams
S. Hickinbotham, A. Cohn
{"title":"Knowledge-Based Recognition of Utility Map Sub-Diagrams","authors":"S. Hickinbotham, A. Cohn","doi":"10.1109/ICDAR.2007.152","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.152","url":null,"abstract":"An integrated map of all utility services in a locale would facilitate better management of the road infrastructure and the utilities themselves. To meet this goal, there exists a need to integrate raster scans of paper maps into GIS by capturing the semantic relationships between the objects in the drawings. In this context, commercially available vectorisation algorithms do not produce a sufficiently rich object representation. We present a structural object recognition system that successfully isolates sectional sub-diagrams in maps of underground utilities. This is built upon a vectorisation system based on a constrained Delaunay triangulation of pen strokes.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127314354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Data Mining Approach to Reading Order Detection
Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.4377050
Michelangelo Ceci, Margherita Berardi, G. Porcelli, D. Malerba
{"title":"A Data Mining Approach to Reading Order Detection","authors":"Michelangelo Ceci, Margherita Berardi, G. Porcelli, D. Malerba","doi":"10.1109/ICDAR.2007.4377050","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.4377050","url":null,"abstract":"Determining the reading order for layout components extracted from a document image can be a crucial problem for several applications. It enables the reconstruction of a single textual element from texts associated to multiple layout components and makes both information extraction and content-based retrieval of documents more effective. A common aspect for all methods reported in the literature is that they strongly depend on the specific domain and are scarcely reusable when the classes of documents or the task at hand changes. In this paper, we investigate the problem of detecting the reading order of layout components by resorting to a data mining approach which acquires the domain specific knowledge from a set of training examples. The input of the learning method is the description of the \"chains\" of layout components defined by the user. Only spatial information is exploited to describe a chain, thus making the proposed approach also applicable to the cases in which no text can be associated to a layout component. The method induces a probabilistic classifier based on the Bayesian framework which is used for reconstructing either single or multiple chains of layout components. It has been evaluated on a set of document images.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125183809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Document Images Retrieval Based on Multiple Features Combination
Gaofeng Meng, N. Zheng, Yonghong Song, Yuanlin Zhang
{"title":"Document Images Retrieval Based on Multiple Features Combination","authors":"Gaofeng Meng, N. Zheng, Yonghong Song, Yuanlin Zhang","doi":"10.1109/ICDAR.2007.103","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.103","url":null,"abstract":"Retrieving the relevant document images from a great number of digitized pages with different kinds of artificial variations and documents quality deteriorations caused by scanning and printing is a meaningful and challenging problem. We attempt to deal with this problem by combining up multiple different kinds of document features in a hybrid way. Firstly, two new kinds of document image features based on the projection histograms and crossings number histograms of an image are proposed. Secondly, the proposed two features, together with density distribution feature and local binary pattern feature, are combined in a multistage structure to develop a novel document image retrieval system. Experimental results show that the proposed novel system is very efficient and robust for retrieving different kinds of document images, even if some of them are severely degraded.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121877801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
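The abstract of "Document Images Retrieval Based on Multiple Features Combination" mentions features built from projection histograms and crossings number histograms. The following is only a rough sketch of what such raw profiles could look like for a binary image; the exact feature definitions used in the paper are not given here, so the function names, the 0/1 image encoding, and the definition of a "crossing" (a background-to-ink transition along a row or column) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: a binary image stored as rows of 0 (background) and 1 (ink).
Image = List[List[int]]


def projection_histograms(img: Image) -> Tuple[List[int], List[int]]:
    """Return (row_profile, column_profile): ink-pixel counts per row and per column."""
    rows = [sum(row) for row in img]
    cols = [sum(col) for col in zip(*img)]
    return rows, cols


def count_crossings(line: Sequence[int]) -> int:
    """Count background-to-ink transitions along one scan line (hypothetical definition)."""
    crossings = 0
    previous = 0  # treat the area outside the image as background
    for pixel in line:
        if pixel == 1 and previous == 0:
            crossings += 1
        previous = pixel
    return crossings


def crossing_histograms(img: Image) -> Tuple[List[int], List[int]]:
    """Return crossing counts for every row and every column."""
    rows = [count_crossings(row) for row in img]
    cols = [count_crossings(col) for col in zip(*img)]
    return rows, cols


if __name__ == "__main__":
    sample = [
        [0, 1, 1, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1],
    ]
    print(projection_histograms(sample))
    print(crossing_histograms(sample))
```

Profiles like these would still need normalization and a distance measure before they could be used for retrieval; those steps are left out because the abstract does not describe them.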
Retrieval of Handwritten Lines in Historical Documents
Lambert Schomaker
{"title":"Retrieval of Handwritten Lines in Historical Documents","authors":"Lambert Schomaker","doi":"10.1109/ICDAR.2007.219","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.219","url":null,"abstract":"This study describes methods for the retrieval of handwritten lines of text in a historical administrative collection. The goal is to develop generic methods for bootstrapping the retrieval system from a tabula rasa starting condition, i.e., the virtual absence of labeled samples. By exploiting the currently available computing power and the fact that computation takes place off line, it should be possible to provide a good starting point for statistical learning methods. In this manner, a closed collection can be incrementally indexed. A cross-correlation method on line-strip images is presented and results are compared to feature-based methods.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123124590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
A Case-Based Reasoning Approach for Invoice Structure Extraction
Hatem Hamza, Y. Belaïd, A. Belaïd
{"title":"A Case-Based Reasoning Approach for Invoice Structure Extraction","authors":"Hatem Hamza, Y. Belaïd, A. Belaïd","doi":"10.1109/ICDAR.2007.3","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.3","url":null,"abstract":"This paper shows the use of case-based reasoning (CBR) for invoice structure extraction and analysis. This method, called CBR-DIA (CBR for document invoice analysis), is adaptive and does not need any previous training. It analyses a document by retrieving and analysing similar documents or elements of documents (cases) stored in a database. The retrieval step is performed thanks to graph comparison techniques like graph probing and edit distance. The analysis step is done thanks to the information found in the nearest retrieved cases. Applied on 950 invoices, CBR-DIA reaches a recognition rate of 85.29% for documents of known classes and 76.33% for documents of unknown classes.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129868197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
HMM-Based Recognizer with Segmentation-free Strategy for Unconstrained Chinese Handwritten Text
Tong-Hua Su, Tian-Wen Zhang, Hu-Jie Huang, Yu Zhou
{"title":"HMM-Based Recognizer with Segmentation-free Strategy for Unconstrained Chinese Handwritten Text","authors":"Tong-Hua Su, Tian-Wen Zhang, Hu-Jie Huang, Yu Zhou","doi":"10.1109/ICDAR.2007.133","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.133","url":null,"abstract":"A segmentation-free strategy based on hidden Markov models (HMMs) is presented for offline recognition of unconstrained Chinese handwriting. As the first step, handwritten textlines are converted to observation sequence by sliding windows and character segmentation stage is avoided prior to recognition. Following that, embedded Baum-Welch algorithm is adopted to train character HMMs. Finally, best character string maximizing the a posteriori is located through Viterbi algorithm. Experiments are conducted on the HIT-MW database written by more than 780 writers. The results show: First, our baseline recognizer outperforms one segmentation-based OCR product with 35% relative improvement; second, more discriminative feature and compact representation, and state-tying technique to alleviate the data sparsity can enhance the recognizer with high confidence. The final recognizer has improved the performance by 10.77% than the baseline system.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132866144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
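The abstract of the HMM-based recognizer above says the best character string is located with the Viterbi algorithm. As a generic illustration of that decoding step only, here is a minimal log-space Viterbi sketch for a discrete HMM. It is not the authors' recognizer: the two toy states, the discrete observation symbols, and all probabilities are invented for the example, whereas the paper works with character HMMs over sliding-window features.

```python
import math
from typing import Dict, List, Sequence, Tuple

LogTable = Dict[str, Dict[str, float]]


def viterbi(
    observations: Sequence[str],
    states: Sequence[str],
    log_start: Dict[str, float],
    log_trans: LogTable,
    log_emit: LogTable,
) -> Tuple[List[str], float]:
    """Return the most probable state path and its log-probability."""
    # Scores of the best path ending in each state after the first observation.
    scores = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers: List[Dict[str, str]] = []

    for obs in observations[1:]:
        new_scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            # Best predecessor state for s at this time step.
            best_prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = scores[best_prev] + log_trans[best_prev][s] + log_emit[s][obs]
            pointers[s] = best_prev
        backpointers.append(pointers)
        scores = new_scores

    # Trace back from the best final state.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[last]


def to_log(table: Dict[str, Dict[str, float]]) -> LogTable:
    """Convert a nested probability table to log-probabilities."""
    return {k: {k2: math.log(v) for k2, v in row.items()} for k, row in table.items()}


if __name__ == "__main__":
    # Toy model (all values are made up for illustration).
    states = ["A", "B"]
    log_start = {"A": math.log(0.5), "B": math.log(0.5)}
    log_trans = to_log({"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}})
    log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}})
    print(viterbi(["x", "x", "y", "y"], states, log_start, log_trans, log_emit))
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which matters for text lines that produce many sliding-window frames.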
Fusing Asynchronous Feature Streams for On-line Writer Identification
A. Schlapbach, H. Bunke
{"title":"Fusing Asynchronous Feature Streams for On-line Writer Identification","authors":"A. Schlapbach, H. Bunke","doi":"10.1109/ICDAR.2007.122","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.122","url":null,"abstract":"In this paper, we present a new approach to improving the performance of a writer identification system by fusing asynchronous feature streams. Different feature streams are extracted from on-line handwritten text acquired from a whiteboard. The feature streams are used to train a text and language independent writer identification system based on Gaussian mixture models (GMMs). From a stroke consisting of n points, n point-based feature vectors and one stroke-based feature vector are extracted. The resulting feature streams thus have an unequal number of feature vectors. We evaluate different methods to directly fuse the feature streams and show that, by means of feature fusion, we can improve the performance of the writer identification system on a data set produced by 200 different writers.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134380976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Synthesis of Chinese Character Using Affine Transformation
Lianwen Jin, XiaoNa Zu
{"title":"Synthesis of Chinese Character Using Affine Transformation","authors":"Lianwen Jin, XiaoNa Zu","doi":"10.1109/ICDAR.2007.239","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.239","url":null,"abstract":"We present a novel Chinese synthesis method based on affine transform. A set of basic Chinese character element (BCCE) are designed, which can be used to generate any Chinese character in standard GB2312-80 level 1. Structure similarity measurement is used to evaluate the synthesis quality. Experiments showed that the synthesized characters look smooth and natural. Storage of synthesized characters can be greatly reduced. The proposed Chinese character synthesis method has many potential applications, such as building small-size Chinese font, building compact classifier for Chinese OCR, and etc.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132048617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
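The abstract of "Synthesis of Chinese Character Using Affine Transformation" describes composing characters from basic elements via affine transforms. The sketch below shows only the general idea of an affine map, x' = a·x + b·y + tx and y' = c·x + d·y + ty, applied to the points of a component so that it can be placed in a target box. The point-list representation, the `fit_to_box` helper, and the coordinates are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Affine parameters (a, b, c, d, tx, ty) for
#   x' = a*x + b*y + tx
#   y' = c*x + d*y + ty
Affine = Tuple[float, float, float, float, float, float]


def apply_affine(points: List[Point], m: Affine) -> List[Point]:
    """Apply one affine transform to every point of a component."""
    a, b, c, d, tx, ty = m
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def fit_to_box(points: List[Point], left: float, top: float,
               width: float, height: float) -> List[Point]:
    """Place a component defined on the unit square into a target box.

    Hypothetical helper: uses a scale-and-translate affine map only.
    """
    return apply_affine(points, (width, 0.0, 0.0, height, left, top))


if __name__ == "__main__":
    # A diagonal stroke defined on the unit square (made-up data).
    stroke = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
    print(fit_to_box(stroke, left=10.0, top=20.0, width=4.0, height=8.0))
```

Storing a small set of components plus six affine parameters per placement is one way the storage saving mentioned in the abstract could arise, though the paper's actual encoding is not described here.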