2016 12th IAPR Workshop on Document Analysis Systems (DAS): Latest Papers, Page 7

Making Europe's Historical Newspapers Searchable

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.83

Clemens Neudecker, A. Antonacopoulos

Citations: 25

Election Tally Sheets Processing System

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.37

J. I. Toledo, A. Fornés, Jordi Cucurull-Juan, J. Lladós

Citations: 0

Understanding Line Plots Using Bayesian Network

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.73

Rathin Radhakrishnan Nair, Nishant Sankaran, Ifeoma Nwogu, V. Govindaraju

Abstract: Information graphics in scientific documents, such as bar charts, graphs, and plots, primarily facilitate a better understanding of information. Graphics are a key component of technical documents because they are simplified representations of complex ideas. When traditional optical character recognition (OCR) systems are used on digitized documents, the ideas conveyed in these information graphics are lost, since OCR typically works only on text. Although tools have more recently been developed to extract information graphics from PDF files, they still do not intelligently interpret the contents of the extracted graphics. We therefore propose a method for identifying the intended messages of line plots using a Bayesian network. We accomplish this by first extracting a dense set of points from a line plot and then representing the entire line plot as a sequence of trends. We then implement a Bayesian network for reasoning about the messages conveyed by the line plots and their trends. We validate our approach by performing experiments on a dataset obtained from computer science conference publications and evaluate the performance of the network against the messages generated by human end users. The resulting intended message gives holistic information about the line plot(s) as well as lower-level information about the trends that make up the plot.
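
The step described in the abstract above, turning a dense set of extracted points into a sequence of trends, can be illustrated with a small sketch. This is not the authors' method: the slope-sign rule, the tolerance `tol`, the three labels, and the function name `label_trends` are all assumptions made for illustration, and the Bayesian network that reasons over the trends is not shown.

```python
from itertools import groupby


def label_trends(points, tol=1e-6):
    """Collapse dense (x, y) points into (label, start_x, end_x) runs.

    Assumes x values are strictly increasing. The slope-sign rule and
    the labels are illustrative choices, not taken from the paper.
    """
    steps = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        slope = (y1 - y0) / (x1 - x0)
        if slope > tol:
            label = "rising"
        elif slope < -tol:
            label = "falling"
        else:
            label = "stable"
        steps.append((label, x0, x1))

    # Merge consecutive steps that share a label into one trend.
    trends = []
    for label, run in groupby(steps, key=lambda step: step[0]):
        run = list(run)
        trends.append((label, run[0][1], run[-1][2]))
    return trends


# Toy points standing in for a curve extracted from a plot image.
points = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 2.0), (4, 2.0), (5, 1.0)]
trends = label_trends(points)
```

A trend sequence of this kind is the sort of compact representation that a probabilistic model could take as evidence when inferring a plot's intended message.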

Citations: 5

SDK Reinvented: Document Image Analysis Methods as RESTful Web Services

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.56

Marcel Würsch, R. Ingold, M. Liwicki

Abstract: Document Image Analysis (DIA) systems are becoming ever more advanced, but also more complex, both computationally and logically. This increases the difficulty of integrating existing state-of-the-art approaches into new research or into practical workflows. The current approach to sharing software is either to publish source code, which leaves the burden to the integrator, or to create a Software Development Kit (SDK), which is often restricted to one programming language. We present DIVAServices, a framework for sharing and accessing DIA methods within the research community and beyond. Using a RESTful web service architecture, we provide access to the methods, so that the binaries of the methods need to be maintained on only one system. All a developer needs to use an algorithm is a simple HTTP request with the image data and the parameters for the method; they then receive the computed results in a format that allows for seamless integration into any kind of workflow or for further processing. Furthermore, DIVAServices is open source, enabling other research groups or libraries to host their own instance in their environment. Using this framework, future DIA systems can be built on the shoulders of well-tested algorithms that are accessible to everyone.
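
The abstract above describes calling a method with "a simple HTTP request with the image data and parameters". The sketch below shows what building such a request could look like with Python's standard library. The endpoint URL, the JSON field names (`image`, `parameters`), and the `threshold` parameter are hypothetical and are not taken from the DIVAServices API; the request is only constructed, not sent.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint, for illustration only; not a real DIVAServices URL.
SERVICE_URL = "http://localhost:8080/binarization/example"


def build_request(image_bytes, parameters, url=SERVICE_URL):
    """Package image data and method parameters as a JSON POST request.

    The payload layout is an assumption; a real service defines its own.
    """
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "parameters": parameters,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder bytes stand in for a real image file.
request = build_request(b"\x89PNG placeholder bytes", {"threshold": 128})
body = json.loads(request.data)

# Sending is deliberately left out. Against a running service one would
# call urllib.request.urlopen(request) and parse the response body.
```

Because the request is plain HTTP with a JSON body, the same call could be made from any language with an HTTP client, which is the language independence the abstract contrasts with a single-language SDK.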

Citations: 23

Multilingual OCR for Indic Scripts

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.68

Minesh Mathew, A. Singh, C. V. Jawahar

Citations: 41

MSIO: MultiSpectral Document Image BinarizatIOn

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.39

Markus Diem, Fabian Hollaus, Robert Sablatnig

Citations: 10

An Interactive Approach with Off-Line and On-Line Handwritten Text Recognition Combination for Transcribing Historical Documents

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.45

Emilio Granell, Verónica Romero, C. Martínez-Hinarejos

Abstract: Automatic transcription of historical documents is becoming an important research topic, especially because of the increasing number of digitised historical documents that libraries and archives are publishing. However, state-of-the-art handwritten text recognition systems are far from perfect, so revision by a human expert is required to produce a transcription of standard quality. In this context, an interactive assistive scenario, where the automatic system and the human transcriber cooperate to generate the perfect transcription, would allow for a more effective approach. In this paper we present a multimodal interactive transcription system where user feedback is provided by means of touchscreen pen strokes and traditional keyboard and mouse operations. The combination of the main data stream and the feedback data stream is based on Confusion Networks derived from the output of the on-line and off-line handwritten text recognition systems. The proposed combination helps to optimise overall performance and usability.
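
To make the idea of combining recogniser outputs through Confusion Networks more concrete, here is a minimal sketch. It assumes a confusion network is a list of slots, each mapping candidate words to posterior probabilities, and that the two networks are already aligned slot by slot. The linear interpolation, the weight `alpha`, and all function names are illustrative assumptions; the abstract does not specify the combination rule the authors use.

```python
def combine_slots(offline, online, alpha=0.5):
    """Interpolate two word-posterior dictionaries and renormalise."""
    words = set(offline) | set(online)
    mixed = {
        w: alpha * offline.get(w, 0.0) + (1 - alpha) * online.get(w, 0.0)
        for w in words
    }
    total = sum(mixed.values())
    return {w: p / total for w, p in mixed.items()}


def combine_networks(offline_cn, online_cn, alpha=0.5):
    """Combine two confusion networks that are assumed to be pre-aligned."""
    return [combine_slots(a, b, alpha) for a, b in zip(offline_cn, online_cn)]


def best_path(cn):
    """Pick the most probable word in every slot."""
    return [max(slot, key=slot.get) for slot in cn]


# Toy posteriors: the off-line recogniser reads the page image, the
# on-line recogniser reads the user's pen-stroke feedback.
offline_cn = [{"the": 0.6, "then": 0.4}, {"king": 0.7, "ring": 0.3}]
online_cn = [{"then": 0.9, "them": 0.1}, {"king": 0.5, "kind": 0.5}]

combined = combine_networks(offline_cn, online_cn)
decoded = best_path(combined)
```

In this toy case the on-line evidence overrides the off-line first choice in the first slot, which is the kind of correction an interactive feedback stream is meant to provide.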

Citations: 10

OCR Accuracy Prediction Method Based on Blur Estimation

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.50

V. C. Kieu, F. Cloppet, N. Vincent

Citations: 9

Natural Scene Character Recognition Using Robust PCA and Sparse Representation

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-01 DOI: 10.1109/DAS.2016.32

Zheng Zhang, Yong Xu, Cheng-Lin Liu

Citations: 6

Visual Analysis System for Features and Distances Qualitative Assessment: Application to Word Image Matching

2016 12th IAPR Workshop on Document Analysis Systems (DAS) Pub Date : 2016-04-01 DOI: 10.1109/DAS.2016.17

Frédéric Rayar, T. Mondal, Sabine Barrat, F. Bouali, G. Venturini

Abstract: In this paper, we present a visual analysis system for qualitatively assessing the features and distance functions used to calculate the dissimilarity between two word images. Computing the dissimilarity between two images is the prerequisite for image matching, indexing, and retrieval problems. First, features are extracted from the word images, and the distance from each image to every other image is computed and represented in matrix form. Then, based on this distance matrix, a proximity graph is built to structure the set of word images and highlight their topology. The proposed visual analysis system is a web-based platform that allows visualisation of and interaction with the obtained graph. This interactive visualisation tool helps users to quickly analyse and understand the relevance and robustness of the selected features and the corresponding distance function in an unsupervised way, i.e. without any ground truth. Experiments are performed on a handwritten dataset of segmented words. Three types of features and four distance functions are considered to describe and compare the word images. These materials are leveraged to evaluate the relevance of the built graph and the usefulness of the platform.
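
The pipeline in the abstract above (features, then a distance matrix, then a proximity graph) can be sketched as follows. The abstract does not say which proximity graph is built, so the relative neighbourhood graph used here is an assumption, as are the Euclidean distance, the toy 2-D feature vectors, and the function names.

```python
import math


def distance_matrix(features):
    """Pairwise Euclidean distances between feature vectors."""
    n = len(features)
    return [
        [math.dist(features[i], features[j]) for j in range(n)]
        for i in range(n)
    ]


def relative_neighbourhood_graph(d):
    """Return the edges (i, j) of the relative neighbourhood graph.

    Two items are linked unless some third item k is closer to both
    of them than they are to each other.
    """
    n = len(d)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            blocked = any(
                max(d[i][k], d[j][k]) < d[i][j]
                for k in range(n)
                if k not in (i, j)
            )
            if not blocked:
                edges.append((i, j))
    return edges


# Toy feature vectors standing in for descriptors of four word images.
features = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
d = distance_matrix(features)
edges = relative_neighbourhood_graph(d)
```

Swapping in a different feature extractor or distance function changes only the matrix, so the resulting graphs can be compared side by side, which is the kind of qualitative assessment the platform is built to support.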

Citations: 1