Intelligent Pen: A Least Cost Search Approach to Stroke Extraction in Historical Documents
Kevin L. Bauer, W. Barrett
Document Recognition and Retrieval, published 2016-02-17
DOI: 10.2352/ISSN.2470-1173.2016.17.DRR-057 (https://doi.org/10.2352/ISSN.2470-1173.2016.17.DRR-057)
Abstract
Kevin L. Bauer, Department of Computer Science, BYU, Master of Science

Extracting strokes from handwriting in historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, overlapping ascenders and descenders, and, when embedded in a table or form, competing lines, making it unsuitable for local line-following algorithms or associated binarization schemes. We introduce Intelligent Pen for piece-wise optimal stroke extraction. Intelligent Pen formulates stroke extraction as a set of piece-wise optimal paths, extracted and assembled in cost order; the extracted strokes are stitched together to provide a complete trace of the handwriting. As such, Intelligent Pen is robust to noise, gaps, faint handwriting, and even competing lines and strokes. Intelligent Pen traces closely match both the shape of the handwriting and the order in which it was written. A quantitative comparison with an ICDAR handwritten stroke data set shows Intelligent Pen traces to be within 0.78 pixels (mean difference) of the manually created strokes.
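The abstract does not specify the cost function or the search used by Intelligent Pen, so the following is only a minimal sketch of the general idea of a least-cost path over a grayscale image, not the authors' method. It assumes a plain Dijkstra search on an 8-connected pixel grid, with a hypothetical step cost of `1 + intensity` so that dark ink is cheap and bright background is expensive. The function name `least_cost_path` and the toy image are illustrative assumptions.

```python
import heapq
import math


def least_cost_path(image, start, goal):
    """Dijkstra search over an 8-connected pixel grid.

    image: list of rows of intensities in [0, 255]; dark ink has low values.
    start, goal: (row, col) tuples inside the image.
    Returns the list of (row, col) pixels from start to goal.
    """
    rows, cols = len(image), len(image[0])
    dist = {start: 0.0}   # best known cost to reach each pixel
    prev = {}             # back-pointers for path reconstruction
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > dist.get((r, c), math.inf):
            continue  # stale heap entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Assumed cost: step length times (1 + intensity),
                # so moving onto ink is cheap and onto paper is costly.
                nd = d + math.hypot(dr, dc) * (1.0 + image[nr][nc])
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        raise ValueError("goal is not reachable inside the image")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path


# Toy image: a horizontal stroke (0 = ink) with a one-pixel gap at column 2.
W, K = 255, 0
image = [
    [W, W, W, W, W],
    [W, W, W, W, W],
    [K, K, W, K, K],
    [W, W, W, W, W],
    [W, W, W, W, W],
]
path = least_cost_path(image, (2, 0), (2, 4))
print(path)
```

Because the path is optimized globally between its endpoints rather than followed pixel by pixel, it crosses the one-pixel gap in the toy stroke instead of stopping there, which is the kind of robustness to gaps the abstract describes. How Intelligent Pen chooses endpoints and assembles the pieces in cost order is not covered by this sketch.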