{"title":"Historical Document Dating Using Unsupervised Attribute Learning","authors":"Sheng He, P. Samara, J. Burgers, Lambert Schomaker","doi":"10.1109/DAS.2016.38","DOIUrl":"https://doi.org/10.1109/DAS.2016.38","url":null,"abstract":"The date of a historical document is important metadata for scholars, who need to know the historical context of the documents they use. This paper presents a novel attribute representation for medieval documents to automatically estimate their dates, that is, the years in which they were written. Non-semantic attributes are discovered in the low-level feature space using an unsupervised attribute learning method. A negative data set is involved in the attribute learning to make sure that our system rejects documents that are not from the Middle Ages or not from the same archives. Experimental results on the Medieval Paleographic Scale (MPS) data set demonstrate that the proposed method achieves state-of-the-art results.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"222 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115656837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RNN Based Uyghur Text Line Recognition and Its Training Strategy","authors":"Pengchao Li, Jiadong Zhu, Liangrui Peng, Yunbiao Guo","doi":"10.1109/DAS.2016.20","DOIUrl":"https://doi.org/10.1109/DAS.2016.20","url":null,"abstract":"The Uyghur language is written in a modified Arabic script. Due to its cursive nature and the lack of sufficient labeled training samples, Uyghur document recognition is still a challenging problem. In this paper, we propose a new Recurrent Neural Network (RNN) based Uyghur text line recognition method combining Gated Recurrent Unit (GRU) and Restricted Boltzmann Machine (RBM) with a pretraining mechanism. We also present a novel curriculum learning technique guided by sample distribution information. Experimental results on a practical Uyghur printed document image dataset show that the proposed network architecture and training strategy not only achieve better recognition accuracy than traditional methods, but also accelerate training.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122676018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Delaunay Triangulation-Based Features for Camera-Based Document Image Retrieval System","authors":"Quoc Bao Dang, Marçal Rusiñol, Mickaël Coustaty, M. Luqman, De Cao Tran, J. Ogier","doi":"10.1109/DAS.2016.66","DOIUrl":"https://doi.org/10.1109/DAS.2016.66","url":null,"abstract":"In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed from the geometrical constraints of each pair of adjacent triangles in the Delaunay triangulation, which is constructed from the centroids of connected components. In addition, we employ a hashing-based indexing system to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experiments are carried out on two datasets comprising 400 heterogeneous-content complex linguistic map images (huge size, 9800 x 11768 pixels) and 700 textual document images.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127542529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Stroke Intersection for Overlapping PGF Elements","authors":"Yan Chen, Xiaoqing Lu, J. Qu, Zhi Tang","doi":"10.1109/DAS.2016.11","DOIUrl":"https://doi.org/10.1109/DAS.2016.11","url":null,"abstract":"Query-by-figure is an effective retrieval approach for educational documents. However, complex geometric diagrams in the field of mathematics education remain as obstacles in current retrieval systems. This study aims to explore a query method for plane geometric figures (PGFs) via sketched figures on smart mobile devices. We adopt an undirected graph model to describe PGFs and a divide-and-conquer strategy to analyze the relationships among strokes. Our main contribution is the detailed analysis of the stroke intersection that frequently occurs in PGFs. Numerous accurate elements obtained through overlapping analysis are then selected to construct strong descriptors for PGFs. Only the compressed query features, instead of a query figure, are transmitted to an image-based retrieval system located on a remote server, where the sketched PGF is finally recognized with low delay response. The experiments show that the proposed method achieves high efficiency and provides users with good interactive experience.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116367128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Searching Corrupted Document Collections","authors":"Jason J. Soo, O. Frieder","doi":"10.1109/DAS.2016.28","DOIUrl":"https://doi.org/10.1109/DAS.2016.28","url":null,"abstract":"Historical documents are typically digitized using Optical Character Recognition. While effective, the results may not always be accurate and are highly dependent on the input. Consequently, degraded documents are often corrupted. Our focus is finding flexible, reliable methods to correct for such degradation, in the face of limited resources. We extend our substring and context fusion based retrieval system, known as Segments, to consider metadata. By extracting topics from documents, and supplementing and weighting our lexicon with co-occurring terms found in documents with those topics, we achieve a statistically significant improvement over the state-of-the-art in all but one test configuration. Our mean reciprocal rank measured on two free, publicly available, independently judged datasets is 0.7657 and 0.5382.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125671789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Removal of Gray Rubber Stamps","authors":"Soumyadeep Dey, J. Mukhopadhyay, S. Sural","doi":"10.1109/DAS.2016.26","DOIUrl":"https://doi.org/10.1109/DAS.2016.26","url":null,"abstract":"Rubber stamps often overlap with original text content of a document, and hence obscure the text regions very badly. Removal of these stamp regions becomes a necessity for successful conversion of such documents into electronic format. Stamp removal from a document becomes more difficult when they are in gray scale, or text and stamp are of the same color. In this paper, we propose a technique to remove such stamps from overlapped regions by identifying stamp regions and stamp pixels.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"6 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131575587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Camera-Based System for User Friendly Annotation of Documents","authors":"Yusuke Oguma, K. Kise","doi":"10.1109/DAS.2016.62","DOIUrl":"https://doi.org/10.1109/DAS.2016.62","url":null,"abstract":"We propose a system for document annotation using a camera mounted on a smartphone. It works on both paper documents and electronic documents thanks to document image retrieval. An important characteristic of this system is the way documents are annotated. The system employs simple character stickers that represent users' opinions (\"hard to understand\", \"interesting\", \"boring\", \"surprising\", \"doubtful\") for friendly annotation of documents. We evaluated our system by varying the way of annotation and found that users most liked the proposed annotation with stickers, though it sometimes caused confusion about the interpretation of stickers. Based on an analysis of the experimental results, we discuss a possible way of solving this issue.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134290964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Page Segmentation for Historical Document Images Based on Superpixel Classification with Unsupervised Feature Learning","authors":"Kai Chen, Cheng-Lin Liu, Mathias Seuret, M. Liwicki, J. Hennebert, R. Ingold","doi":"10.1109/DAS.2016.13","DOIUrl":"https://doi.org/10.1109/DAS.2016.13","url":null,"abstract":"In this paper, we present an efficient page segmentation method for historical document images. Many existing methods either rely on hand-crafted features or perform rather slowly, as they treat the problem as a pixel-level assignment problem. In order to create a feasible method for real applications, we propose to use superpixels as basic units of segmentation, and features are learned directly from pixels. An image is first oversegmented into superpixels with the simple linear iterative clustering (SLIC) algorithm. Then, each superpixel is represented by the features of its central pixel. The features are learned from pixel intensity values with stacked convolutional autoencoders in an unsupervised manner. A support vector machine (SVM) classifier is used to classify superpixels into four classes: periphery, background, text block, and decoration. Finally, the segmentation results are refined by a connected component based smoothing procedure. Experiments on three public datasets demonstrate that compared to our previous method, the proposed method is much faster and achieves comparable segmentation results. Additionally, far fewer pixels are used for classifier training.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129372879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keyword Retrieval Using Scale-Space Pyramid","authors":"Irina Rabaev, K. Kedem, Jihad El-Sana","doi":"10.1109/DAS.2016.16","DOIUrl":"https://doi.org/10.1109/DAS.2016.16","url":null,"abstract":"We propose a pyramid-based method for keyword spotting in historical document images. The documents are represented by a scale-space pyramid of their features. The search for a query keyword begins at the highest level of the pyramid, where the initial candidates for matching are located. The candidates are further refined at each level of the pyramid. The number of levels is adaptive and depends on the length of the query word. The results from all the document images are combined and ranked. We compare two feature representations, grid-based and continuous, and show that continuous feature representation outperforms the grid-based representation. In order to reduce the memory used to store the scale-space pyramid of features, we discuss and compare two compressing approaches. The proposed method was evaluated on four different collections of historical documents achieving state-of-the-art results.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115233292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word Segmentation Using the Student's-t Distribution","authors":"G. Louloudis, Giorgos Sfikas, N. Stamatopoulos, B. Gatos","doi":"10.1109/DAS.2016.35","DOIUrl":"https://doi.org/10.1109/DAS.2016.35","url":null,"abstract":"Word segmentation refers to the process of defining the word regions of a text line. It is a critical stage towards word and character recognition as well as word spotting and mainly concerns three basic stages, namely preprocessing, distance computation and gap classification. In this paper, we propose a novel word segmentation method which uses the Student's-t distribution for the gap classification stage. The main advantage of the Student's-t distribution concerns its robustness to the existence of outliers. In order to test the efficiency of the proposed method we used the four benchmarking datasets of the ICDAR/ICFHR Handwriting Segmentation Contests as well as a historical typewritten dataset of Greek polytonic text. It is observed that the use of mixtures of Student's-t distributions for word segmentation outperforms other gap classification methods in terms of Recognition Accuracy and F-Measure. Also, in terms of all examined benchmarks, the Student's-t is shown to produce a perfect segmentation result in significantly more cases than the state-of-the-art Gaussian mixture model.","PeriodicalId":197359,"journal":{"name":"2016 12th IAPR Workshop on Document Analysis Systems (DAS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115494752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}