{"title":"A Laplacian Method for Video Text Detection","authors":"T. Phan, P. Shivakumara, C. Tan","doi":"10.1109/ICDAR.2009.153","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.153","url":null,"abstract":"In this paper, we propose an efficient text detection method based on the Laplacian operator. The maximum gradient difference value is computed for each pixel in the Laplacian-filtered image. K-means is then used to classify all the pixels into two clusters: text and non-text. For each candidate text region, the corresponding region in the Sobel edge map of the input image undergoes projection profile analysis to determine the boundary of the text blocks. Finally, we employ empirical rules to eliminate false positives based on geometrical properties. Experimental results show that the proposed method is able to detect text of different fonts, contrast and backgrounds. Moreover, it outperforms three existing methods in terms of detection and false positive rates.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127152707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Devanagari and Bangla Text Extraction from Natural Scene Images","authors":"U. Bhattacharya, S. K. Parui, Srikanta Mondal","doi":"10.1109/ICDAR.2009.178","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.178","url":null,"abstract":"With the increasing popularity of digital cameras attached with various handheld devices, many new computational challenges have gained significance. One such problem is extraction of texts from natural scene images captured by such devices. The extracted text can be sent to OCR or to a text-to-speech engine for recognition. In this article, we propose a novel and effective scheme based on analysis of connected components for extraction of Devanagari and Bangla texts from camera captured scene images. A common unique feature of these two scripts is the presence of headline and the proposed scheme uses mathematical morphology operations for their extraction. Additionally, we consider a few criteria for robust filtering of text components from such scene images. Moreover, we studied the problem of binarization of such scene images and observed that there are situations when repeated binarization by a well-known global thresholding approach is effective. We tested our algorithm on a repository of 100 scene images containing texts of Devanagari and / or Bangla.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127340784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Extraction of Text from Camera Images","authors":"S. Chowdhury, Soumyadeep Dhar, A. Das, B. Chanda, K. McMenemy","doi":"10.1109/ICDAR.2009.188","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.188","url":null,"abstract":"Text within a camera grabbed image can contain a huge amount of meta data about that scene. Such meta data can be useful for identification, indexing and retrieval purposes. Detection of colored scene text is a new challenge for all camera based images. Common problems for text extraction from camera based images are the lack of prior knowledge of any kind of text features such as color, font, size and orientation. In this paper we propose a new algorithm for the extraction of text from an image which can overcome these problems. In addition, problems due to an unconstrained complex background in the scene have also been addressed. Here a new technique is applied to determine the discrete edges around the text boundaries. A novel methodology is also proposed to extract the text exploiting its appearance in terms of color and spatial distribution.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123720824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Style-Based Ballot Mark Recognition","authors":"Pingping Xiu, D. Lopresti, H. Baird, G. Nagy, E. B. Smith","doi":"10.1109/ICDAR.2009.273","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.273","url":null,"abstract":"The push toward voting via hand-marked paper ballots has focused attention on the limitations of current optical scan systems. Discrepancies between human and machine interpretations of ballot markings can lead to a loss of trust in the election process. In this paper, a style-based approach to ballot recognition is proposed in which marks are recognized collectively rather than in isolation. The consistency of a voter's style is leveraged to improve the overall accuracy of the system. We compare style-based recognition to various kinds of singlet classifiers and show that it outperforms them by a substantial margin.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"22 45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130534577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Trends in Digital Scanning Processes","authors":"S. Impedovo, R. Modugno, A. Ferrante, E. Stasolla","doi":"10.1109/ICDAR.2009.76","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.76","url":null,"abstract":"Handwritten document analysis and recognition deals with several different application fields. In document processing, one of the first problems that must be solved is data acquisition. The selection of the appropriate acquisition device plays a fundamental role; it depends on the different types of source documents and on the different application domains. Digital scanners allow both massive document acquisition and conversion of paper based documents into electronic documents. Depending on the environment requirements, the original quality of the analog signal must be appropriately captured and preserved in the digital conversion and the right scanner need to be selected. This paper presents an overview of the main characteristics of the scanners today available on the market and highlights their main properties. By considering the operation difficulties of actual scanners and the properties of the new plastic materials, this paper presents some ideas for future possible developments in digital scanning.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127625677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Approach for Skew Correction of Documents Based on Particle Swarm Optimization","authors":"J. Sadri, M. Cheriet","doi":"10.1109/ICDAR.2009.268","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.268","url":null,"abstract":"This paper presents a novel approach for skew correction of documents. Skew correction is modeled as an optimization problem, and for the first time, Particle Swarm Optimization (PSO) is used to solve skew optimization. A new objective function based on local minima and maxima of projection profiles is defined, and PSO is utilized to find the best angle that maximizes differences between values of local minima and maxima. In our approach, local minima and maxima converge to the locations of lines and spaces between lines. Results of our skew correction algorithm are shown on documents written in different scripts such as Latin and Arabic related scripts (e.g. Arabic, Farsi, Urdu, ...). Experiments show that our algorithm can handle a wide range of skew angles, also it is robust to gray level and binary images of different scripts.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"56 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131671920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of Nom Text Regions from Stele Images Using Area Voronoi Diagram","authors":"Thai V. Hoang, S. Tabbone, N. Pham","doi":"10.1109/ICDAR.2009.13","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.13","url":null,"abstract":"Automatic processing of images of steles is a challenging problem due to the variation in their structures and body text characteristics. In this paper, area Voronoi diagram is used to represent the neighborhood of connected components in stele images containing Nom characters. Body text region is then extracted from stele images by the selection of appropriate adjacent Voronoi regions based on the information about the thickness of neighboring connected components. Experimental results show that the proposed method is highly accurate and robust to various types of stele.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"2017 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126633752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mathematical Symbol Indexing Using Topologically Ordered Clusters of Shape Contexts","authors":"S. Marinai, Beatrice Miotti, G. Soda","doi":"10.1109/ICDAR.2009.120","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.120","url":null,"abstract":"This paper addresses the indexing and retrieval of mathematical symbols from digitized documents. The proposed approach exploits Shape Contexts (SC) to describe the shape of mathematical symbols. Starting from the vector space method, that is based on SC clustering, we explore the use of topological ordered clusters to improve the retrieval performance. The clustering is computed by means of Self-Organizing Maps that organize the clusters in two dimensional topologically ordered feature maps. The retrieval performance are compared with those obtained using the K-means clustering on a large collection of mathematical symbols gathered from the widely used INFTY database.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126636620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmentation of Arabic Handwriting Based on both Contour and Skeleton Segmentation","authors":"S. Wshah, Zhixin Shi, V. Govindaraju","doi":"10.1109/ICDAR.2009.152","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.152","url":null,"abstract":"We propose a new algorithm for segmentation of off-line handwritten Arabic words. The algorithm segments the connected letters to smaller segments each of which contains no more than three letters. Each letter may be segmented to at most five pieces. In addition to improving the recognition of Arabic words, another potential application of the proposed segmentation method is to build lexicon of small size, consisting of no more than three letter combinations. Generally, it is very hard to generate lexicon for recognition of unconstraint handwritten Arabic documents due to the large number of words of Arabic language. The algorithm has been tested on over 6300 words from 45 different documents written by 18 writers. The system is able to segment more than 93% of the words into segments, each containing at most one letter, 6% of the words into segments that contains two letters and 3% of the words into segments that contains three letters.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126871018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Document Binarization Based on Connected Operators","authors":"B. Naegel, L. Wendling","doi":"10.1109/ICDAR.2009.42","DOIUrl":"https://doi.org/10.1109/ICDAR.2009.42","url":null,"abstract":"An original binarization method based on connected operators is proposed in this paper. Connected operators enable to filter and/or segment an image by preserving its contours. The proposed binarization method enables to extract relevant document objects by means of the component-tree structure. This method was compared to other binarization methods and showed good behavior in various contexts.","PeriodicalId":433762,"journal":{"name":"2009 10th International Conference on Document Analysis and Recognition","volume":"812 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123325440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}