{"title":"Performance Improvement in Local Feature Based Camera-Captured Character Recognition","authors":"Takahiro Matsuda, M. Iwamura, K. Kise","doi":"10.1109/DAS.2014.78","DOIUrl":"https://doi.org/10.1109/DAS.2014.78","url":null,"abstract":"Concerning camera-captured Japanese character recognition, we have proposed a method to recognize characters, both simple and complex, that may not be linearly aligned and may be printed with a complex background. Recognition is performed based on local features and their arrangement. The arrangement is validated with an algorithm called local RANSAC. However, at least four corresponding local features are required. To relax that condition, we propose a new recognition method making it possible to recognize a character region with at least three corresponding local features. This method enables recall and precision to be improved with the simpler characters using more corresponding local features and computation times to be reduced by 7%.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116922539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrey C. Sariev, Vladislav Nenchev, Stefan Gerdjikov, Petar Mitankin, Hristo Ganchev, S. Mihov, Tinko Tinchev
{"title":"Flexible Noisy Text Correction","authors":"Andrey C. Sariev, Vladislav Nenchev, Stefan Gerdjikov, Petar Mitankin, Hristo Ganchev, S. Mihov, Tinko Tinchev","doi":"10.1109/DAS.2014.12","DOIUrl":"https://doi.org/10.1109/DAS.2014.12","url":null,"abstract":"We present a new general and language independent approach to the noisy text correction problem developed and implemented in the framework of the CULTURA project. We briefly describe the core candidate generator, REBELS, the complete system concept, its efficient implementation based on functional automata and its immediate applications. The quality of the whole system is empirically established in different experimental settings where language and noise sources are varied.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124034292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors","authors":"T. Do, S. Tabbone, O. R. Terrades","doi":"10.1109/DAS.2014.62","DOIUrl":"https://doi.org/10.1109/DAS.2014.62","url":null,"abstract":"This paper proposes a new approach to spot symbols into graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed comparing the similarity between vector models. Evaluation on SESYD datasets demonstrates that our method is promising.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123095693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hierarchical Framework for Accent Based Writer Identification","authors":"Chetan Ramaiah, V. Govindaraju","doi":"10.1109/DAS.2014.69","DOIUrl":"https://doi.org/10.1109/DAS.2014.69","url":null,"abstract":"Writer identification is the process of determining the author of a handwritten specimen by utilizing characteristics inherent in the sample. In this work, we apply the concept of accents in handwriting to introduce a novel perspective for writer identification. Analogous to speech, accents in handwriting can be defined as distinctive writing quirks that are unique to a group of people sharing a common native script. Specifically, we postulate that a group of people with a common native script will share certain traits in their handwriting style that are exposed when they write in a different script. We propose a hierarchical framework for the writer identification task, wherein, we first identify the accent of the writer. In the next step, we perform writer identification based on the selected accent. This framework reduces the complexity of the classification task by reducing the number of classes at the prediction stage. Experiments are performed on the UNIPEN dataset and the results lend credibility to our model.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125316102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study to Achieve Manga Character Retrieval Method for Manga Images","authors":"M. Iwata, Atsushi Ito, K. Kise","doi":"10.1109/DAS.2014.60","DOIUrl":"https://doi.org/10.1109/DAS.2014.60","url":null,"abstract":"Manga (Japanese style comics) is one of the most popular publications. Nowadays manga is often handled as digital images not only in consumers' use but also in digital media. However, they hardly handle manga as content-based materials. Some digital media use tags or text data for retrieval, where the tags and text data are produced by handmade input. Therefore our goal is achieving content-based retrieval method for manga images. As the first step to the goal, we investigate the performance of Sun's method applying to manga character retrieval. Manga character retrieval means a image retrieval of which the input and output are a character image and page images where the input character appears respectively. It is useful for convenient use of manga images, for example, character retrieval or auto-tagging. We modify Sun's method so as to be applicable to manga character retrieval and then investigate the performance.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123352516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multilingual Off-Line Handwriting Recognition in Real-World Images","authors":"M. Kozielski, P. Doetsch, M. Hamdani, H. Ney","doi":"10.1109/DAS.2014.8","DOIUrl":"https://doi.org/10.1109/DAS.2014.8","url":null,"abstract":"We propose a state-of-the-art system for recognizing real-world handwritten images exposing a huge degree of noise and a high out-of-vocabulary rate. We describe methods for successful image demising, line removal, deskewing, deslanting, and text line segmentation. We demonstrate how to use a HMM-based recognition system to obtain competitive results, and how to further improve it using LSTM neural networks in the tandem approach. The final system outperforms other approaches on a new dataset for English and French handwriting. The presented framework scales well across other standard datasets.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"321 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123648855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-Dependent Confusions Rules for Building Error Model Using Weighted Finite State Transducers for OCR Post-Processing","authors":"M. A. Azawi, T. Breuel","doi":"10.1109/DAS.2014.75","DOIUrl":"https://doi.org/10.1109/DAS.2014.75","url":null,"abstract":"In this paper, we propose a new technique to correct the OCR errors by means of weighted finite state transducers(WFST) with context-dependent confusion rules. We translate the OCR confusions which appear in the recognition outputs into edit operations, e.g. insertions, deletions and substitutions using Levenshtein edit distance algorithm. The edit operations are extracted in a form of rules with respect to the context of the incorrect string to build an error model using weighted finite state transducers. The context-dependent rules help to fit the rule in the appropriate strings. Our new error model avoids the calculations that occur in searching the language model and it also makes the language model eligible to correct incorrect words by using context-dependent confusion rules. Our approach is language independent. It designed to deal with different number of errors. It has no limited words size. In the set of experiments conducted on the ocred pages from the UWIII dataset, our new proposed error model outperforms. The evaluation shows the error rate of our model on the UWIII testset is 0.68%, while the baseline is 1.14% and the error rate of the existing state-of-the-art single character rules-based approach is 1.0%.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124702184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Complete Logo Detection/Recognition System for Document Images","authors":"Alireza Alaei, Mathieu Delalandre","doi":"10.1109/DAS.2014.79","DOIUrl":"https://doi.org/10.1109/DAS.2014.79","url":null,"abstract":"In this paper, a complete logo detection/ recognition system for document images is proposed. In the proposed system, first, a logo detection method is employed to detect a few regions of interest (logo-patches), which likely contain the logo(s), in a document image. The detection method is based on the piece-wise painting algorithm (PPA) and some probability features along with a decision tree. For the logo recognition, a template based recognition approach is proposed to recognize the logo which may present in every detected logo-patch. The proposed logo recognition strategy uses a search space reduction technique to decrease the number of template logo-models needed for the recognition of a logo in a detected logo-patch. The features used for search space reduction are based on the geometric properties of a detected logo-patch. Based on our experimentations on 1290 document images of Tobacco800 dataset, 99.31% of the logos were detected as logo-patches. Among the detected logo-patches 97.90% of logos were fairly recognized. Considering both logo detection and recognition results, 97.22% of the logos in the document images could truly be detected/recognized as the overall performance of the proposed system.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125118781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Influence of Language Orthographic Characteristics on Digital Word Recognition","authors":"Ofer Biller, Jihad El-Sana, K. Kedem","doi":"10.1093/llc/fqu051","DOIUrl":"https://doi.org/10.1093/llc/fqu051","url":null,"abstract":"We study the effect of language orthographic characteristics on the performance of digital word recognition in degraded documents such as historical documents. We provide a rigorous scheme for quantifying the influence of the orthographic characteristics on the quality of word recognition in such documents. We study and compare several orthographic characteristics for four natural languages and measure the effect of each individual characteristic on the digital word recognition process. To this end we create synthetic languages, for which all characteristics, except the one we examine, are identical, and measure the performance of two word recognition algorithms on synthetic documents of these languages. We examine and summarize the influence of the values of each characteristic on the performance of these word recognition methods.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125616678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Praveen Krishnan, Naveen Sankaran, A. Singh, C. V. Jawahar
{"title":"Towards a Robust OCR System for Indic Scripts","authors":"Praveen Krishnan, Naveen Sankaran, A. Singh, C. V. Jawahar","doi":"10.1109/DAS.2014.74","DOIUrl":"https://doi.org/10.1109/DAS.2014.74","url":null,"abstract":"The current Optical Character Recognition OCR systems for Indic scripts are not robust enough for recognizing arbitrary collection of printed documents. Reasons for this limitation includes the lack of resources (e.g. not enough examples with natural variations, lack of documentation available about the possible font/style variations) and the architecture which necessitates hard segmentation of word images followed by an isolated symbol recognition. Variations among scripts, latent symbol to UNICODE conversion rules, non-standard fonts/styles and large degradations are some of the major reasons for the unavailability of robust solutions. In this paper, we propose a web based OCR system which (i) follows a unified architecture for seven Indian languages, (ii) is robust against popular degradations, (iii) follows a segmentation free approach, (iv) addresses the UNICODE re-ordering issues, and (v) can enable continuous learning with user inputs and feedbacks. Our system is designed to aid the continuous learning while being usable i.e., we capture the user inputs (say example images) for further improving the OCRs. We use the popular BLSTM based transcription scheme to achieve our target. This also enables incremental training and refinement in a seamless manner. We report superior accuracy rates in comparison with the available OCRs for the seven Indian languages.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132877344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}