{"title":"A Segmentation-Free Handwritten Word Spotting Approach by Relaxed Feature Matching","authors":"A. Hast, A. Fornés","doi":"10.1109/DAS.2016.40","DOIUrl":"https://doi.org/10.1109/DAS.2016.40","url":null,"abstract":"The automatic recognition of historical handwritten documents is still considered a challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired by feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shapes, there is no exact transformation to be obtained. However, a sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123682488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entity Local Structure Graph Matching for Mislabeling Correction","authors":"Nihel Kooli, A. Belaïd, Aurélie Joseph, V. P. d'Andecy","doi":"10.1109/DAS.2016.36","DOIUrl":"https://doi.org/10.1109/DAS.2016.36","url":null,"abstract":"This paper proposes an entity local structure comparison approach based on inexact subgraph matching. The comparison results are used for mislabeling correction in the local structure. The latter represents a set of entity attribute labels which are physically close in a document image. It is modeled by an attributed graph describing the content and presentation features of the labels by the nodes and the geometrical features by the arcs. A local structure graph is matched with a structure model which represents a set of local structure model graphs. The structure model is initially built using a set of well chosen local structures based on a graph clustering algorithm and is then incrementally updated. The subgraph matching adopts a specific cost function that integrates the feature dissimilarities. The matched model graph is used to extract the missed labels, prune the extraneous ones and correct the erroneous label fields in the local structure. The evaluation of the structure comparison approach on 525 local structures extracted from 200 business documents achieves about 90% for recall and 95% for precision. The mislabeling correction rates in these local structures vary between 73% and 100%.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132026281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Banknote Counterfeit Detection through Background Texture Printing Analysis","authors":"A. Berenguel, O. R. Terrades, J. Lladós, C. Cañero","doi":"10.1109/DAS.2016.34","DOIUrl":"https://doi.org/10.1109/DAS.2016.34","url":null,"abstract":"This paper is focused on the detection of counterfeit photocopy banknotes. The main difficulty is to work on a real industrial scenario without any constraint about the acquisition device and with a single image. The main contributions of this paper are twofold: first, the adaptation and performance evaluation of existing approaches to classify the genuine and photocopy banknotes using background texture printing analysis, which have not been applied in this context before. Second, a new dataset of Euro banknote images acquired with several cameras under different luminance conditions to evaluate these methods. Experiments on the proposed algorithms show that mixing SIFT features and sparse coding dictionaries achieves quasi perfect classification using a linear SVM with the created dataset. Approaches using dictionaries to cover all possible texture variations have proven to be robust and outperform the state-of-the-art methods using the proposed benchmark.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129898824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Arabic Handwriting Recognition with Dropout Applied in Deep Recurrent Neural Networks","authors":"R. Maalej, Najiba Tagougui, M. Kherallah","doi":"10.1109/DAS.2016.49","DOIUrl":"https://doi.org/10.1109/DAS.2016.49","url":null,"abstract":"Lately, Online Arabic Handwriting Recognition has been gaining more interest because of advances in technology such as handwriting capturing devices and impressive mobile computers. Since we always try to improve recognition rates, we propose in this work a new system based on deep recurrent neural networks to which the dropout technique is applied. Our approach is very practical in sequence modelling due to the networks' recurrent connections; it can also learn intricate relationships between input and output layers because of its many non-linear hidden layers. In addition to these contributions, our system is protected against overfitting due to the powerful performance of dropout. The proposed system was tested on the large ADAB dataset to show its performance under difficult conditions such as the variety of writers, the large vocabulary and the diversity of styles.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130796322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Simple and Effective Solution for Script Identification in the Wild","authors":"A. Singh, Anand Mishra, P. Dabral, C. V. Jawahar","doi":"10.1109/DAS.2016.57","DOIUrl":"https://doi.org/10.1109/DAS.2016.57","url":null,"abstract":"We present an approach for automatically identifying the script of the text localized in scene images. Our approach is inspired by the advancements in mid-level features. We represent the text images using mid-level features which are pooled from densely computed local features. Once text images are represented using the proposed mid-level feature representation, we use an off-the-shelf classifier to identify the script of the text image. Our approach is efficient and requires very little labeled data. We evaluate the performance of our method on the recently introduced CVSI dataset, demonstrating that the proposed approach can correctly identify the script of 96.70% of the text images. In addition, we also introduce and benchmark a more challenging Indian Language Scene Text (ILST) dataset for evaluating the performance of our method.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128478058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preserving Text Content from Historical Handwritten Documents","authors":"Arpita Chakraborty, M. Blumenstein","doi":"10.1109/DAS.2016.77","DOIUrl":"https://doi.org/10.1109/DAS.2016.77","url":null,"abstract":"We propose a holistic, dynamic method to preserve text content with zero tolerance while removing marginal noise for historical handwritten document images. The key idea is to identify and analyze the region between the sharp peak at the edge and page frame of the text content at each margin. Depending on the proximity of the sharp peak to the text, the text content is then extracted from the document image. This method automatically adapts thresholds for each single document image and is directly applicable to gray-scale images. The proposed method is evaluated on four diverse handwritten historical datasets: Queensland State Archive (QSA), Saint Gall, Parzival and the Prosecution Project. Experimental results show that the proposed method achieves higher accuracy compared with other methods tested on the Saint Gall and Parzival datasets, whilst for the other two Australian datasets, which have been introduced here for the first time, the results are very encouraging.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126624766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance of an Off-Line Signature Verification Method Based on Texture Features on a Large Indic-Script Signature Dataset","authors":"S. Pal, Alireza Alaei, U. Pal, M. Blumenstein","doi":"10.1109/DAS.2016.48","DOIUrl":"https://doi.org/10.1109/DAS.2016.48","url":null,"abstract":"In this paper, a signature verification method based on texture features involving off-line signatures written in two different Indian scripts is proposed. Both Local Binary Patterns (LBP) and Uniform Local Binary Patterns (ULBP), as powerful texture feature extraction techniques, are used for characterizing off-line signatures. The Nearest Neighbour (NN) technique is considered as the similarity metric for signature verification in the proposed method. To evaluate the proposed verification approach, a large Bangla and Hindi off-line signature dataset (BHSig260) comprising 6240 (260×24) genuine signatures and 7800 (260×30) skilled forgeries was introduced and further used for experimentation. We further used the GPDS-100 signature dataset for a comparison. The experiments were conducted, and the verification accuracies were separately computed for the LBP and ULBP texture features. There were no remarkable changes in the results obtained applying the LBP and ULBP features for verification when the BHSig260 and GPDS-100 signature datasets were used for experimentation.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114801990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Marginal Noise Reduction in Historical Handwritten Documents -- A Survey","authors":"Arpita Chakraborty, M. Blumenstein","doi":"10.1109/DAS.2016.78","DOIUrl":"https://doi.org/10.1109/DAS.2016.78","url":null,"abstract":"This paper presents a survey of different approaches for removing marginal noise from document images, and analyses the research challenges of those methods relating to handwritten historical datasets. In this survey, historical documents collected from Australian Archives and Libraries are introduced and the associated layout complexities of those document images are also described. Benchmarking other historical databases related to this work is also discussed. This survey discusses the difficulties and suitability of the state-of-the-art methods to remove marginal noise as well as preserving the text content from handwritten historical documents. This survey helps researchers to identify appropriate methods according to the associated marginal noise and also illustrates their drawbacks in order to make suggestions for developing approaches which are more general and robust for any dataset.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114929870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What You See is What You Get? Automatic Image Verification for Online News Content","authors":"Sarah Elkasrawi, A. Dengel, Ahmed Abdelsamad, S. S. Bukhari","doi":"10.1109/DAS.2016.75","DOIUrl":"https://doi.org/10.1109/DAS.2016.75","url":null,"abstract":"Consuming news over online media has witnessed rapid growth in recent years, especially with the increasing popularity of social media. However, the ease and speed with which users can access and share information online facilitated the dissemination of false or unverified information. One way of assessing the credibility of online news stories is by examining the attached images. These images could be fake, manipulated or not belonging to the context of the accompanying news story. Previous attempts to news verification provided the user with a set of related images for manual inspection. In this work, we present a semi-automatic approach to assist news-consumers in instantaneously assessing the credibility of information in hypertext news articles by means of meta-data and feature analysis of images in the articles. In the first phase, we use a hybrid approach including image and text clustering techniques for checking the authenticity of an image. In the second phase, we use a hierarchical feature analysis technique for checking the alteration in an image, where different sets of features, such as edges and SURF, are used. In contrast to recently reported manual news verification, our presented work shows a quantitative measurement on a custom dataset. Results revealed an accuracy of 72.7% for checking the authenticity of attached images with a dataset of 55 articles. Finding alterations in images resulted in an accuracy of 88% for a dataset of 50 images.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117343938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Isolated Handwritten Digit Recognition Using oBIFs and Background Features","authors":"A. Gattal, Chawki Djeddi, Y. Chibani, I. Siddiqi","doi":"10.1109/DAS.2016.10","DOIUrl":"https://doi.org/10.1109/DAS.2016.10","url":null,"abstract":"This study demonstrates how the combination of oriented Basic Image Features (oBIFs) with the background concavity features can be effectively employed to enhance the performance of isolated digit recognition systems. The features are extracted without any size normalization from the complete image as well as from different regions of the image by applying a uniform grid sampling to the image. Classification is carried out using one-against-all support vector machine (SVM) while the experimental study is conducted on the standard CVL single digit database. A series of evaluations using different feature configurations and combinations realized high recognition rates which are compared with the state-of-the-art methods on this subject.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114978182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}