DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432559
"Performance analysis of feature extractors and classifiers for script recognition of English and Gurmukhi words"
Rajneesh Rani, R. Dhir, Gurpreet Singh Lehal
Abstract: Script recognition is a challenging problem in a multilingual country like India, where many different scripts are in use. For optical character recognition of multilingual documents, blocks, lines, words and characters of different scripts must be separated before being fed to the OCRs of the individual scripts. Researchers have proposed many approaches to script recognition at different levels (block, line, word and character). Indian documents in any state language commonly contain English words mixed with words in that state language. In this paper, we extract three different types of features, namely structural, Gabor and Discrete Cosine Transform (DCT) features, from isolated English and Gurmukhi words and compare their script recognition performance using three different classifiers: Support Vector Machine (SVM), k-Nearest Neighbor (k-NN) and Parzen Probabilistic Neural Network (PNN).

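As a rough illustration of one of the feature extractors compared above, the sketch below computes the 2-D DCT-II of a square word-image block and keeps the top-left low-frequency coefficients as the feature vector. The block size and the number of coefficients kept are illustrative assumptions, not the paper's settings.

```python
import math

def dct2_features(block, keep=4):
    """2-D DCT-II of a square pixel block; keep the top-left keep x keep
    low-frequency coefficients as the feature vector (a common choice;
    the paper's exact truncation is an assumption here)."""
    n = len(block)

    def dct1(v):
        # 1-D DCT-II: X[k] = sum_i v[i] * cos(pi/n * (i + 0.5) * k)
        return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                    for i in range(n)) for k in range(n)]

    rows = [dct1(r) for r in block]                            # DCT along rows
    cols = [dct1([rows[i][j] for i in range(n)]) for j in range(n)]  # then columns
    # cols[j][k] is the coefficient at (row frequency k, column frequency j)
    return [cols[j][k] for k in range(keep) for j in range(keep)]
```

For a flat block all energy lands in the DC coefficient, which is a quick sanity check that the transform is wired correctly.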
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432558
"Super-resolution of single text image by sparse representation"
Rim Walha, Fadoua Drira, Frank Lebourgeois, A. Alimi
Abstract: This paper addresses the problem of generating a super-resolved text image from a single low-resolution image. The proposed Super-Resolution (SR) method is based on sparse coding, which suggests that image patches can be well represented as a sparse linear combination of elements from a suitably chosen learned dictionary. To this end, a High-Resolution/Low-Resolution (HR/LR) patch-pair database is collected from high-quality character images. To our knowledge, it is the first generic database allowing SR of text images such as those contained in documents, signs, labels, bills, etc. This database is used to train two dictionaries jointly, so that the sparse representation of an LR image patch over the first dictionary can be applied to generate an HR image patch from the second dictionary. The performance of this approach is evaluated and compared, visually and quantitatively, with other existing SR methods applied to text images. In addition, we examine the influence of text image resolution on automatic recognition performance and further justify the effectiveness of the proposed SR method compared to others.

DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432576
"Detection and removal of hand-drawn underlines in a document image using approximate digital straightness"
Sanjoy Pratihar, Partha Bhowmick, S. Sural, J. Mukhopadhyay
Abstract: A novel algorithm for the detection and removal of underlines in a scanned document page is proposed. The underlines treated here are hand-drawn and of various patterns; an important feature is that they are drawn almost horizontally. To locate them, we detect the edges of their covers as sequences of approximately straight segments, which are grown horizontally. The novelty of the algorithm lies in the detection of almost-straight segments from the boundary edge map of the underline parts. After obtaining the exact cover of the underlines, an effective strategy is applied for underline removal. Experimental results demonstrate the efficiency and robustness of the method.

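The approximate-digital-straightness test above can be caricatured by a much cruder detector: the sketch below flags long horizontal ink runs in a binary image as underline candidates and erases them. This is a simplification that ignores the paper's slant tolerance and edge-cover analysis; `min_len` is an assumed parameter.

```python
def find_underline_runs(img, min_len=20):
    """img: binary image as a list of rows (1 = ink). Return (row, start, end)
    for each maximal horizontal ink run of at least min_len pixels, a crude
    stand-in for the paper's approximate-straightness test."""
    runs = []
    for r, row in enumerate(img):
        c = 0
        while c < len(row):
            if row[c]:
                s = c
                while c < len(row) and row[c]:
                    c += 1
                if c - s >= min_len:
                    runs.append((r, s, c))
            else:
                c += 1
    return runs

def remove_runs(img, runs):
    """Erase the detected underline pixels in place."""
    for r, s, e in runs:
        for c in range(s, e):
            img[r][c] = 0
    return img
```

A real implementation must also restore character strokes that the underline crosses, which this sketch simply erases.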
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432570
"Margin noise removal from printed document images"
Soumyadeep Dey, J. Mukhopadhyay, S. Sural, Partha Bhowmick
Abstract: In this paper, we propose a technique for removing margin noise (both textual and non-textual) from scanned document images. We perform layout analysis to detect words, lines, and paragraphs in the document image. These detected elements are classified into text and non-text components on the basis of their characteristics (size, position, etc.), and the geometric properties of the text blocks are then used to detect and remove the margin noise. We evaluate our algorithm on several scanned pages of Bengali literature books.

DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432577
"On performance analysis of end-to-end OCR systems of Indic scripts"
P. P. Kumar, C. Bhagvati, A. Agarwal
Abstract: Performance evaluation of end-to-end OCR systems for Indic scripts requires matching the Unicode sequences of the OCR output against the ground truth. In the literature, Levenshtein edit distance has been used to compute the error rates of OCR systems, but accuracies are not explicitly reported. In the present work, we propose an accuracy measure based on edit distance and use it in conjunction with the error rate to report the performance of an OCR system. We analyze the relationship between accuracy and error rate quantitatively; our analysis shows that the two are independent of each other, so both are needed to report the complete performance of an OCR system. The proposed approach is applicable to all Indic scripts, and experimental results on scripts such as Devanagari, Telugu and Kannada are shown.

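A minimal sketch of the kind of measures discussed above, assuming Levenshtein distance and simple length normalizations; the paper's exact accuracy formula is not reproduced here, so the `accuracy` definition below is an illustrative assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                   # deletion
                           cur[j - 1] + 1,                # insertion
                           prev[j - 1] + (ca != cb)))     # substitution
        prev = cur
    return prev[-1]

def error_rate(ocr: str, gt: str) -> float:
    """Character error rate: edits normalized by ground-truth length.
    Note this can exceed 1.0, which is one reason it cannot stand in
    for an accuracy figure."""
    return levenshtein(ocr, gt) / len(gt)

def accuracy(ocr: str, gt: str) -> float:
    """Hypothetical edit-distance-based accuracy, normalized by the longer
    of the two strings (illustrative; not the paper's exact measure)."""
    return 1 - levenshtein(ocr, gt) / max(len(ocr), len(gt))
```

Because the two quantities are normalized differently, a low error rate does not pin down the accuracy (and vice versa), which matches the paper's point that both should be reported.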
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432556
"Hindi handwritten word recognition using HMM and symbol tree"
S. Belhe, Chetan Paulzagade, Akash Deshmukh, Saumya Jetley, Kapil Mehrotra
Abstract: The proposed approach recognizes online handwritten isolated Hindi words using a combination of HMMs trained on Devanagari symbols and a tree formed from the multiple possible sequences of recognized symbols.
In general, words in Indic languages are composed of a number of aksharas, or syllables, which in turn are formed by groups of consonants and vowel modifiers. Segmentation of aksharas is critical to accurate recognition of both the recognition primitives and the complete word, and recognition is itself an intricate job. Our work targets this holistic task of akshara segmentation, symbol identification and subsequent word recognition, handled in an integrated segmentation-recognition framework. By using online stroke information to postulate symbol candidates and deriving a HOG feature set from their image counterparts, recognition becomes independent of stroke-order and stroke-shape variations; the system is thus well suited to unconstrained handwriting.
Data for this work was collected from different parts of India where the Hindi language is predominantly in use. Symbols extracted from 60,000 words are used to train and test 140 symbol-HMM models. The system outputs one or more candidate words to the user by tracing multiple tree paths (up to leaf nodes), under the condition that the symbol likelihood (confidence score) at every node is above a threshold. Tests performed on 10,000 words yield an accuracy of 89%.

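The thresholded tree traversal described above can be sketched as follows. The node structure and names are illustrative, not the paper's, and plain letters stand in for Devanagari symbols.

```python
def candidate_words(node, threshold, prefix=""):
    """node: (symbol, confidence, children). Depth-first trace of the
    symbol tree; a path contributes a candidate word only if every node
    on it meets the confidence threshold, as described above."""
    symbol, conf, children = node
    if conf < threshold:
        return []                 # prune this subtree entirely
    word = prefix + symbol
    if not children:
        return [word]             # reached a leaf: emit the candidate
    results = []
    for child in children:
        results.extend(candidate_words(child, threshold, word))
    return results
```

Lowering the threshold admits more paths, so the system can trade precision for a longer candidate list shown to the user.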
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432555
"Automatic localization and correction of line segmentation errors"
Anand Mishra, Naveen Sankaran, Viresh Ranjan, C. V. Jawahar
Abstract: Text line segmentation is a basic step in any OCR system, and its failure deteriorates the performance of OCR engines. This is especially true for Indian languages due to the nature of their scripts. Many segmentation algorithms have been proposed in the literature, but they often fail to adapt dynamically to a given page and thus tend to yield poor segmentation for specific regions or pages. In this work we design a text line segmentation post-processor which automatically localizes and corrects segmentation errors. The proposed post-processor, which works in a "learning by examples" framework, is not only independent of the underlying segmentation algorithm but also robust to the diversity of scanned pages. We show over 5% improvement in text line segmentation on a large dataset of scanned pages for multiple Indian languages.

DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432574
"Lightweight user-adaptive handwriting recognizer for resource constrained handheld devices"
D. Dutta, Aruni Roy Chowdhury, U. Bhattacharya, S. K. Parui
Abstract: We present our recent attempt to develop a lightweight handwriting recognizer suitable for resource-constrained handheld devices. Such an application requires real-time recognition of handwritten characters produced on touchscreens. The proposed approach achieves minimal user lag on devices with only limited computing power, in sharp contrast to standard laptops or desktop computers. Moreover, the approach is user-adaptive in the sense that it can adapt through user corrections to wrong predictions; with an increasing number of interactive corrections by the user, recognition accuracy improves significantly. An input stroke is first re-sampled to a fixed, small number of sample points such that at most two critical points (points of high curvature) are preserved. We use their x- and y-coordinates as the feature vector and compute no other high-level features. The squared Mahalanobis distance is used to identify each stroke of the input sample as one of several stroke categories pre-determined from a large pool of training samples; the inverted covariance matrix and mean vector required for each stroke class are pre-calculated and stored as serialized objects on the SD card of the device. A look-up table (LUT) with stroke combinations as keys and character classes as values produces the final Unicode character output. In case of an incorrect output, user corrections automatically update the LUT, adapting it to the user's particular handwriting style.

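The on-device classification step can be sketched with plain lists, assuming each stroke class ships a precomputed mean vector and inverted covariance matrix as described; function names and the dictionary layout are illustrative.

```python
def squared_mahalanobis(x, mean, inv_cov):
    """Squared Mahalanobis distance (x - mean)^T * inv_cov * (x - mean),
    written with plain lists so no numeric library is needed on-device."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = [sum(row[j] * d[j] for j in range(len(d))) for row in inv_cov]  # inv_cov @ d
    return sum(di * yi for di, yi in zip(d, y))

def classify_stroke(x, classes):
    """classes: {label: (mean, inv_cov)}, precomputed offline as in the
    paper. Returns the label with the smallest squared distance."""
    return min(classes, key=lambda c: squared_mahalanobis(x, *classes[c]))
```

Storing the inverted covariance rather than the covariance is what keeps the per-stroke cost to one matrix-vector product, with no inversion at recognition time.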
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432565
"Model based table cell detection and content extraction from degraded document images"
Zhixin Shi, S. Setlur, V. Govindaraju
Abstract: This paper describes a novel method for the detection and extraction of table-cell contents from handwritten document images. Given a model of the table and a document image containing a table, the hand-drawn or pre-printed table is detected and the contents of the table cells are extracted automatically. The algorithms are designed to handle degraded binary document images; the target images may include a wide variety of noise, ranging from clutter and salt-and-pepper noise to non-text objects such as graphics and logos.
The algorithm eliminates extraneous noise and identifies potentially matching table layout candidates by detecting horizontal and vertical table-line candidates. A table is represented as a matrix based on the locations of the intersections of horizontal and vertical table lines; a matching algorithm searches for the table structure that best matches the given layout model, using the matching score to eliminate spurious table-line candidates. The optimally matched table candidate is then used for cell content extraction.
The method was tested on a set of document page images containing tables from the challenge set of the DARPA MADCAT Arabic handwritten document image data. Preliminary results indicate that the method is effective and capable of reliably extracting text from table cells.

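The intersection-matrix idea above can be illustrated minimally: once the horizontal and vertical line candidates have been accepted, adjacent line pairs bound the cells. This is only the final bookkeeping step, not the paper's matching algorithm.

```python
def cell_boxes(h_lines, v_lines):
    """h_lines: sorted y-coordinates of accepted horizontal table lines;
    v_lines: sorted x-coordinates of vertical lines. An m x n grid of
    lines yields (m-1)*(n-1) cells, each returned as (x0, y0, x1, y1)."""
    return [(v_lines[j], h_lines[i], v_lines[j + 1], h_lines[i + 1])
            for i in range(len(h_lines) - 1)
            for j in range(len(v_lines) - 1)]
```

Each box can then be used to crop the corresponding cell region for content extraction.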
DAR '12. Pub Date: 2012-12-16. DOI: 10.1145/2432553.2432569
"Choice of recognizable units for URDU OCR"
Gurpreet Singh Lehal
Abstract: There has been considerable work on Arabic OCR, but almost all of it is based on the Naskh style. Urdu script is based on the Arabic alphabet but uses the Nastalique style, which makes OCR in general, and character segmentation in particular, a highly challenging task; most researchers therefore avoid the character segmentation phase and opt for a higher unit of recognition. For Urdu, the next higher recognition unit considered is the ligature, which lies between the character and the word: a ligature is a connected component of one or more characters, and an Urdu word is usually composed of 1 to 8 ligatures. A related issue is the identification of all possible ligatures for recognition purposes. To this end, we performed a statistical analysis of an Urdu corpus to collect and organise the Urdu ligatures. The number of unique ligatures exceeds 26,000, and recognition over such a huge class set is a Herculean task, so it becomes necessary to reduce the class count and look for an alternative recognition unit. From an OCR point of view, a ligature can be further segmented into one primary connected component and zero or more secondary connected components: the primary component represents the basic shape of the ligature, while the secondary components correspond to the dots, diacritic marks and special symbols associated with it. To reduce the class count, ligatures with similar primary components are clubbed together. Further statistical analysis counts the primary components and arranges them in descending order of frequency, yielding a manageable class of around 2300 recognition units that covers 99% of the Urdu corpus.

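The clubbing-and-coverage analysis described above can be sketched as follows, with toy data standing in for the corpus statistics; the mapping from ligature to primary component is assumed to be given.

```python
from collections import Counter

def coverage_classes(ligature_counts, primary_of, target=0.99):
    """ligature_counts: mapping ligature -> corpus frequency.
    primary_of: mapping ligature -> its primary connected component.
    Club ligatures sharing a primary component, sort class counts in
    descending order, and keep the smallest prefix covering `target`
    of the corpus, mirroring the analysis described above."""
    classes = Counter()
    for lig, n in ligature_counts.items():
        classes[primary_of[lig]] += n          # club by primary component
    total = sum(classes.values())
    kept, covered = [], 0
    for comp, n in classes.most_common():      # descending frequency
        kept.append(comp)
        covered += n
        if covered / total >= target:
            break
    return kept
```

On the real corpus this is the step that shrinks more than 26,000 ligatures to roughly 2300 primary-component classes at 99% coverage.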