Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. Latest articles

Text-mining based journal splitting
Xiaofan Lin
{"title":"Text-mining based journal splitting","authors":"Xiaofan Lin","doi":"10.1109/ICDAR.2003.1227822","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227822","url":null,"abstract":"This paper introduces a novel journal splitting algorithm. It takes full advantage of various kinds of information such as text match, layout and page numbers. The core procedure is a highly efficient text-mining algorithm, which detects the matched phrases between the content pages and the title pages of individual articles. Experiments show that this algorithm is robust and able to split a wide range of journals, magazines and books.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116961984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
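The abstract above describes matching phrases between a journal's contents pages and the title pages of individual articles. A minimal sketch of that idea follows, assuming word bigram overlap as the matching criterion. The function names, the `min_shared` threshold and the bigram choice are all hypothetical illustrations and are not taken from the paper, which also uses layout and page-number cues not modelled here.

```python
import re


def ngrams(text, n=2):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def find_article_starts(toc_entries, pages, n=2, min_shared=2):
    """Map each contents-page entry to the index of the best-matching page.

    Hypothetical helper: a page is a candidate title page when it shares
    at least `min_shared` word n-grams with the contents-page entry.
    """
    starts = {}
    for entry in toc_entries:
        entry_grams = ngrams(entry, n)
        best_page, best_score = None, 0
        for idx, page_text in enumerate(pages):
            score = len(entry_grams & ngrams(page_text, n))
            if score > best_score:
                best_page, best_score = idx, score
        if best_score >= min_shared:
            starts[entry] = best_page
    return starts


toc = ["Text mining for journal splitting", "Image segmentation by learning"]
pages = [
    "Text mining for journal splitting. Abstract: this paper ...",
    "body text continues with experiments and results",
    "Image segmentation by learning. Abstract: a new method ...",
]
starts = find_article_starts(toc, pages)
```

Here `pages` holds only the body pages; a real contents page would match every entry and would have to be excluded first.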
Image segmentation by learning approach
H. Legal-Ayala, J. Facon
{"title":"Image segmentation by learning approach","authors":"H. Legal-Ayala, J. Facon","doi":"10.1109/ICDAR.2003.1227776","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227776","url":null,"abstract":"This article describes a new segmentation by thresholding approach based on learning. The method consists in learning to threshold correctly submitting both an image and its ideal thresholded version. From this stage it is generated a decision matrix for each pixel and each gray level that is re-utilized at the moment of the new images segmentation. The new image is thresholded by means of a new strategy based on the nearest neighbors, that seeks, for each pixel of this new image, the best solution in the decision matrix. Performed tests on handwritten documents showed promising results. In terms of quality of the results, the developed technique is equal or superior to the traditional segmentation by thresholding techniques, with the advantage that the one discussed here does not requires the use of heuristic parameters.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117259280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Writer identification based on the fractal construction of a reference base
A. Seropian, M. Grimaldi, N. Vincent
{"title":"Writer identification based on the fractal construction of a reference base","authors":"A. Seropian, M. Grimaldi, N. Vincent","doi":"10.1109/ICDAR.2003.1227840","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227840","url":null,"abstract":"Our aim is to achieve writer identification process thanks to a fractal analysis of handwriting style. For each writer, a set of characteristics is extracted. They are specific to the writer. Advantage is taken from the autosimilarity properties that are present in one's handwriting. In order to do that, some invariant patterns characterizing the writing are extracted. During the training step these invariant patterns appear along a fractal compression process, then they are organized in a reference base that can be associated with the writer. This base allows to analyze an unknown writing the writer of which has to be identified. A Pattern Matching process is performed using all the reference bases successively. The results of this analyze are estimated through the signal to noise ratio. Thus, the signal to noise ratio according to a set of bases identifies the unknown text's writer.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115903434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
An architecture for ink annotations on Web documents
Sriram Ramachandran, R. Kashi
{"title":"An architecture for ink annotations on Web documents","authors":"Sriram Ramachandran, R. Kashi","doi":"10.1109/ICDAR.2003.1227669","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227669","url":null,"abstract":"There have been recent improvements in document technologies like the standardization of object interfaces to access and manipulate the properties of Web documents. There has also been significant progress in pen based computing for recognition of digital ink in desktops, tablets and handheld devices. These have necessitated a need for further research on annotation architectures for digital documents, specifically pen-based annotation systems. This paper presents an attempt to leverage the new standards of DHTML and W3C DOM that are being gradually implemented by popular browsers, to build a prototype of an ink annotation system with common components across browsers. One of the primary goals in this study is to semantically link ink data with underlying document elements like text and images. The system has three components: a) ink capture and rendering b) Ink Understanding, which recognizes and associates ink with the underlying document; and c) Ink storage and retrieval.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115432751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Text identification in noisy document images using Markov random model
Yefeng Zheng, Huiping Li, D. Doermann
{"title":"Text identification in noisy document images using Markov random model","authors":"Yefeng Zheng, Huiping Li, D. Doermann","doi":"10.1109/ICDAR.2003.1227734","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227734","url":null,"abstract":"In this paper we address the problem of the identification of text from noisy documents. We segment and identify handwriting from machine printed text because 1) handwriting in a document often indicates corrections, additions or other supplemental information that should be treated differently from the main body or body content, and 2) the segmentation and recognition techniques for machine printed text and handwriting are significantly different. Our novelty is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise. We further exploit context to refine the classification. A Markov random field (MRF) based approach is used to model the geometrical structure of the printed text, handwriting and noise to rectify the mis-classification. Experimental results show our approach is promising and robust, and can significantly improve the page segmentation results in noise documents.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121871123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Detection of text marks on moving vehicles
R. Kasturi
{"title":"Detection of text marks on moving vehicles","authors":"R. Kasturi","doi":"10.1109/ICDAR.2003.1227696","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227696","url":null,"abstract":"Vehicle text marks are unique features which are useful for identifying vehicles in video surveillance applications. We propose a method for finding such text marks. An existing text detection algorithm is modified such that detection is increased and made more robust to outdoor conditions. False alarm is reduced by introducing a binary image test which remove detections that are not likely to be caused by text. The method is tested on a captured video of a typical street scene.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"250 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121880873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Individuality of numerals
S. Srihari, C. Tomai, Bin Zhang, Sangjik Lee
{"title":"Individuality of numerals","authors":"S. Srihari, C. Tomai, Bin Zhang, Sangjik Lee","doi":"10.1109/ICDAR.2003.1227826","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227826","url":null,"abstract":"The analysis of handwritten documents from the viewpoint of determining their writership has great bearing on the criminal justice system. In many cases, only a limited amount of handwriting is available and sometimes it consists of only numerals. Using a large number of handwritten numeral images extracted from about 3000 samples written by 1000 writers, a study of the individuality of numerals for identification/verification purposes was conducted. The individuality of numerals was studied using cluster analysis. Numerals discriminability was measured for writer verification. The study shows that some numerals present a higher discriminatory power and that their performances for the verification/identification tasks are very different.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129982815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
A scalable solution for integrating illustrated parts drawings into a Class IV Interactive Electronic Technical Manual
Molly L. Boose, D. B. Shema, Lawrence S. Baum
{"title":"A scalable solution for integrating illustrated parts drawings into a Class IV Interactive Electronic Technical Manual","authors":"Molly L. Boose, D. B. Shema, Lawrence S. Baum","doi":"10.1109/ICDAR.2003.1227679","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227679","url":null,"abstract":"This paper discusses a scalable solution for integrating legacy illustrated parts drawings into a Class IV Interactive Electronic Technical Manual (IETM) (1995). An IETM is an interactive electronic version of a system's technical manuals such as for a commercial airplane or a military helicopter. It contains the information a technician needs to do her job including troubleshooting, vehicle maintenance and repair procedures. A Class IV IETM is an IETM that is authored and managed directly via a database. The end-user system optimizes viewing and navigation, minimizing the need for users to browse and search through large volumes of data. The Boeing Company has hundreds of thousands of illustrated parts drawings for both commercial and military vehicles. As Boeing migrates to Class IV IETM systems, it is necessary to incorporate existing illustrated parts drawings into the new systems. Manually re-authoring the drawings to bring them up to the level of a Class IV IETM is prohibitively expensive. Our solution is to provide a batch-processing system that performs the required modifications to the raster images and automatically updates the IETM database.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129323292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
SAGENT: a novel technique for document modeling for secure access and distribution
Sanaul Hoque, H. Selim, G. Howells, M. Fairhurst, F. Deravi
{"title":"SAGENT: a novel technique for document modeling for secure access and distribution","authors":"Sanaul Hoque, H. Selim, G. Howells, M. Fairhurst, F. Deravi","doi":"10.1109/ICDAR.2003.1227859","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227859","url":null,"abstract":"A novel strategy for the representation and manipulation of distributed documents, potentially complex and heterogeneous, is presented in this paper. The document under the proposed model is represented in a hierarchical structure. Associated metadata' describes the flexible hierarchy with the scope of dynamically restructuring the tree at runtime. All useful functionals can also be included within the hierarchy to minimize reliance on external programs in manipulating sensitive data. This gives the proposed model two key properties: generality (capable of representing any document format including future innovations) and autonomy (non-reliance on external programs). The model also allows incorporation of additional features for security and access control. Biometric person authentication measures are introduced. A brief example illustrates the key ideas.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129488369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Classification of Web documents using a graph model
A. Schenker, Mark Last, H. Bunke, A. Kandel
{"title":"Classification of Web documents using a graph model","authors":"A. Schenker, Mark Last, H. Bunke, A. Kandel","doi":"10.1109/ICDAR.2003.1227666","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227666","url":null,"abstract":"In this paper we describe work relating to classification of Web documents using a graph-based model instead of the traditional vector-based model for document representation. We compare the classification accuracy of the vector model approach using the k-nearest neighbor (k-NN) algorithm to a novel approach which allows the use of graphs for document representation in the k-NN algorithm. The proposed method is evaluated on three different Web document collections using the leave-one-out approach for measuring classification accuracy. The results show that the graph-based k-NN approach can outperform traditional vector-based k-NN methods in terms of both accuracy and execution time.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"2009 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128232333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 102
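The abstract above describes k-NN classification with documents represented as graphs rather than vectors. A minimal sketch of how that could look follows, under stated assumptions: nodes are unique words, directed edges link adjacent words, and the distance is one minus the size of the common subgraph divided by the size of the larger graph. The abstract does not specify the graph construction or the distance measure, so `make_graph`, `mcs_distance` and `knn_classify` are hypothetical names chosen for illustration.

```python
from collections import Counter


def make_graph(text):
    """Build a (nodes, edges) pair: unique words and adjacent-word links."""
    words = text.lower().split()
    return set(words), set(zip(words, words[1:]))


def graph_size(graph):
    """Size of a graph, taken here as node count plus edge count."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def mcs_distance(g1, g2):
    """Distance in [0, 1] based on the common subgraph of g1 and g2.

    Assumption: because node labels are unique, the largest common
    subgraph is simply the intersection of the node and edge sets.
    """
    common = len(g1[0] & g2[0]) + len(g1[1] & g2[1])
    denom = max(graph_size(g1), graph_size(g2))
    return 1.0 - common / denom if denom else 0.0


def knn_classify(query, training, k=3):
    """Return the majority label among the k training graphs nearest to query."""
    nearest = sorted(training, key=lambda item: mcs_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [
    (make_graph("stock market prices fall"), "finance"),
    (make_graph("stock market rally continues"), "finance"),
    (make_graph("football team wins match"), "sport"),
    (make_graph("football match ends in draw"), "sport"),
]
query = make_graph("stock market prices rise")
predicted = knn_classify(query, training, k=3)
```

Only the distance function changes relative to vector-based k-NN; the neighbour search and majority vote stay the same.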