2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR): Latest Papers

Layout and Perspective Distortion Independent Recognition of Captured Chinese Document Image
Yanwei Wang, Yuefang Sun, Changsong Liu
{"title":"Layout and Perspective Distortion Independent Recognition of Captured Chinese Document Image","authors":"Yanwei Wang, Yuefang Sun, Changsong Liu","doi":"10.1109/ICDAR.2017.102","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.102","url":null,"abstract":"This paper introduced a layout and perspective distortion independent recognition framework for captured Chinese document images. Under the framework, 1) Conditional random field (CRF) is employed for text line extraction from a global point of view. As the text line extraction is layout independent, it could be widely used in different types of document images. 2) A text line image based perspective distortion correction method is detailed and used in three different ways. 3) The text line extraction and perspective distortion correction are combined with character recognition to construct a recognition system. On three captured document image datasets, the proposed framework improves the accuracies from 94.03% to 95.20%, 13.01% to 93.71% and 10.63% to 92.68% respectively for different distortion degrees. The experimental results demonstrate that the introduced recognition framework is promising for solving layout and perspective distortion problems in captured document image recognition.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125218978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Machine Learning vs Deterministic Rule-Based System for Document Stream Segmentation
Ahmed Hamdi, J. Voerman, Mickaël Coustaty, Aurélie Joseph, V. P. d'Andecy, J. Ogier
{"title":"Machine Learning vs Deterministic Rule-Based System for Document Stream Segmentation","authors":"Ahmed Hamdi, J. Voerman, Mickaël Coustaty, Aurélie Joseph, V. P. d'Andecy, J. Ogier","doi":"10.1109/ICDAR.2017.332","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.332","url":null,"abstract":"Classical document stream segmentation methods rely on physical separators (white pages, pages with a specific stamp, etc.) to automatically split documents from the stream (detecting the beginning and the ending of documents). In order to reduce costly efforts, a recent work using a contextual rule-based approach was proposed to automate this process. Such rules tend to detect continuity, rupture or uncertainty between pairs of pages. Even if these first results were encouraging, performance remained unsatisfactory. In this context, we propose to compare this existing rule-based approach to a machine learning method based on Doc2Vec so as to evaluate and compare their strengths and weaknesses. This study was led on a corpus of more than 4,000 real administrative documents composed of more than 8,000 pages. The machine learning approach gives better results on multipage documents while the rule-based method performs best with single page documents.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"BME-26 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121003568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
A Case Study of the Relationship between Local Pen Action and Three Dimensional Shapes of Handwritten Strokes
Yoshinori Akao, Yoshiyasu Higashikawa
{"title":"A Case Study of the Relationship between Local Pen Action and Three Dimensional Shapes of Handwritten Strokes","authors":"Yoshinori Akao, Yoshiyasu Higashikawa","doi":"10.1109/ICDAR.2017.389","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.389","url":null,"abstract":"In this article, we performed a case study of the relationship between local pen action and three dimensional shapes of handwritten strokes on paper sheet. The purpose of the study is to enrich the knowledge effective for forensic handwriting examination. Samples for analysis were one Japanese Hiragana character written by one participant. Online and offline handwritings were captured simultaneously by using ink pen tablet. The type of pen was ballpoint pen, and characters were written on paper for plain copy placed on the tablet. The position and pen pressure information were captured at 200 Hz. The precision of pen position was 0.25 mm, and the pen pressure information was at 15 bit. Experimental results showed that the depth information of overall area of character was related with the density of handwritten strokes. On the other hand, the local shape of handwritten stroke was considered to be related with local pen action. As the pen pressure increased, the depth and the width of handwritten strokes increased. In addition, the influence of pen pressure spread widely around handwritten strokes. However, the local shape was not only dependent on pen pressure but also on pen speed.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128767643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Framework for Document Specific Error Detection and Corrections in Indic OCR
Rohit Saluja, D. Adiga, Ganesh Ramakrishnan, P. Chaudhuri, Mark James Carman
{"title":"A Framework for Document Specific Error Detection and Corrections in Indic OCR","authors":"Rohit Saluja, D. Adiga, Ganesh Ramakrishnan, P. Chaudhuri, Mark James Carman","doi":"10.1109/ICDAR.2017.308","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.308","url":null,"abstract":"In this paper, we present a framework for assisting word-level corrections in Indic OCR documents by incorporating the ability to identify, segment and combine partially correct word forms. The partially correct word forms themselves may be obtained from corrected parts of the document itself and auxiliary sources such as dictionaries and common OCR character confusions. Our framework updates a domain dictionary and learns OCR specific n-gram confusions from the human feedback on the fly. The framework can also leverage consensus between outputs of multiple OCR systems on the same text as an auxiliary source for dynamic dictionary building. Experimental evaluations confirm that for highly inflectional Indian languages, matching partially correct word forms can result in significant reduction in the amount of manual input required for correction. Furthermore, significant gains are observed when the consolidated output of multiple OCR systems is employed as an auxiliary source of information. We have corrected over 1100 pages (13 books) in Sanskrit, 190 pages (1 book) in Marathi, 50 pages (part of a book) in Hindi and 1000 pages (12 books) in English using our framework. We present a book-wise analysis of improvement in required human interaction for these languages.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128779813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Qumran Letter Restoration by Rotation and Reflection Modified PixelCNN
L. Uzan, N. Dershowitz, Lior Wolf
{"title":"Qumran Letter Restoration by Rotation and Reflection Modified PixelCNN","authors":"L. Uzan, N. Dershowitz, Lior Wolf","doi":"10.1109/ICDAR.2017.14","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.14","url":null,"abstract":"The task of restoring fragmentary letters is fundamental to the reading of ancient manuscripts. We present a method to complete broken letters in the Dead Sea Scrolls, which is based on PixelCNN++. Since the generation of the broken letters is conditioned on the extant scroll, we modify the original method to allow reconstructions in multiple directions. Results on both simulated data and real scrolls demonstrate the advantage of our method over the baseline. The implementation may be found at https://github.com/ghostcow/pixel-cnn-qumran.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132839415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
A Comprehensive Survey on Handwriting and Computerized Graphology
Afnan H. Garoot, Maedeh Safar, C. Suen
{"title":"A Comprehensive Survey on Handwriting and Computerized Graphology","authors":"Afnan H. Garoot, Maedeh Safar, C. Suen","doi":"10.1109/ICDAR.2017.107","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.107","url":null,"abstract":"Graphology is a technique used to assess the writer's personality traits from his/her handwriting features. Manual feature extraction and analysis is a time consuming and labor intensive task. Therefore, computerized graphology systems have been developed by researchers to overcome these issues. In this paper, we present the latest state-of-the-art on computerized graphology systems.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114918510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Benchmarking Keypoint Filtering Approaches for Document Image Matching
Emilien Royer, J. Chazalon, Marçal Rusiñol, F. Bouchara
{"title":"Benchmarking Keypoint Filtering Approaches for Document Image Matching","authors":"Emilien Royer, J. Chazalon, Marçal Rusiñol, F. Bouchara","doi":"10.1109/ICDAR.2017.64","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.64","url":null,"abstract":"Reducing the amount of keypoints used to index an image is particularly interesting to control processing time and memory usage in real-time document image matching applications, like augmented documents or smartphone applications. This paper benchmarks two keypoint selection methods on a task consisting of reducing keypoint sets extracted from document images, while preserving detection and segmentation accuracy. We first study the different forms of keypoint filtering, and we introduce the use of the CORE selection method on keypoints extracted from document images. Then, we extend a previously published benchmark by including evaluations of the new method, by adding the SURF-BRISK detection/description scheme, and by reporting processing speeds. Evaluations are conducted on the publicly available dataset of ICDAR2015 SmartDOC challenge 1. Finally, we prove that reducing the original keypoint set is always feasible and can be beneficial not only to processing speed but also to accuracy.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"7 19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130648777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Online Handwritten Mongolian Word Recognition Using a Novel Sliding Window Method with Recurrent Neural Networks
Ji Liu, Long-Long Ma, Jian Wu
{"title":"Online Handwritten Mongolian Word Recognition Using a Novel Sliding Window Method with Recurrent Neural Networks","authors":"Ji Liu, Long-Long Ma, Jian Wu","doi":"10.1109/ICDAR.2017.39","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.39","url":null,"abstract":"Because of the conglutinated characteristic of Mongolian words, it's difficult to realize online handwritten Mongolian word recognition with high recognition accuracy based on segmentation-based strategy. Meanwhile, as the vocabulary of Mongolian words is large, using a segmentation-free method with deep bidirectional long short term memory (DBLSTM) network is more suitable. We design a 5 bidirectional hidden level DBLSTM network for online handwritten Mongolian word recognition. This paper mainly proposes a novel sliding window method which selects frames with different intervals to enhance recognition rate. The novel method can generate hundreds of sequence data for each sample, while only one sequence data is generated using ordinary sliding window method. More sequence data and more abundant sequence information are helpful to raise the recognition rate. We evaluated the recognition performance on our online handwritten Mongolian database with 925 classes. The proposed method achieves the word level recognition rate of 89.24% with PCA feature extractor and best path decoding, compared to that of 88.45% using ordinary sliding window method. Further, several well trained DBLSTM models based on the proposed method are combined to vote the output, finally, the word-level recognition raises to 90.35%.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"7 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116739859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Learning Structural Loss Parameters on Graph Embedding Applied on Symbolic Graphs
H. Jarraya, O. R. Terrades, J. Lladós
{"title":"Learning Structural Loss Parameters on Graph Embedding Applied on Symbolic Graphs","authors":"H. Jarraya, O. R. Terrades, J. Lladós","doi":"10.1109/ICDAR.2017.268","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.268","url":null,"abstract":"We propose an amelioration of proposed Graph Embedding (GEM) method in previous work that takes advantages of structural pattern representation and the structured distortion. It models an Attributed Graph (AG) as a Probabilistic Graphical Model (PGM). Then, it learns the parameters of this PGM presented by a vector, as new signature of AG in a lower dimensional vectorial space. We focus to adapt the structured learning algorithm via 1_slack formulation with a suitable risk function, called Graph Edit Distance (GED). It defines the dissimilarity of the ground truth and predicted graph labels. It is determined by the error tolerant graph matching using bipartite graph matching algorithm. We apply Structured Support Vector Machines (SSVM) to process classification task. During our experiments, we got our results on the GREC dataset.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116896602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Landscape or Portrait? The Impact of Page Orientation on the Understandability of Scientific Posters
Marc Beck, Seyyed Saleh Mozaffari Chanijani, S. S. Bukhari, A. Dengel
{"title":"Landscape or Portrait? The Impact of Page Orientation on the Understandability of Scientific Posters","authors":"Marc Beck, Seyyed Saleh Mozaffari Chanijani, S. S. Bukhari, A. Dengel","doi":"10.1109/ICDAR.2017.376","DOIUrl":"https://doi.org/10.1109/ICDAR.2017.376","url":null,"abstract":"The recent developments in the eye tracking technology lead to new insights in how humans read, yet little is known about how the layout affects the comprehension. In this study, the differences in the understandability and the reading behaviour of two different page orientations (portrait and landscape) of a scientific poster are investigated. An eye tracking experiment was designed to find out whether the participants focus more on different areas in different orientations and whether the orientation has any effect on the reading behaviour or the overall comprehension of the poster. The participants' gazes were recorded and mapped onto the document using homographies. The saccade and transitional analysis over 30 participants concludes that the portrait orientation is better for remembering specific details while the landscape orientation supplements a high level understanding.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"08 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131375759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1