Learning Text-Line Segmentation Using Codebooks and Graph Partitioning

2012 International Conference on Frontiers in Handwriting Recognition. Pub Date: 2012-09-18. DOI: 10.1109/ICFHR.2012.228

Le Kang, J. Kumar, Peng Ye, D. Doermann


Citations: 11

Abstract




In this paper, we present a codebook-based method for handwritten text-line segmentation which uses image patches in the training data to learn a graph-based similarity for clustering. We first construct a codebook of image patches using K-medoids and obtain exemplars which encode local evidence. We then obtain the corresponding codewords for all patches extracted from a given image and construct a similarity graph using the learned evidence; this graph is partitioned to obtain text-lines. Our learning-based approach performs well on a field dataset containing degraded and unconstrained handwritten Arabic document images. Results on the ICDAR 2009 segmentation contest dataset show that the method is competitive with previous approaches.
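
The three stages named in the abstract (K-medoids codebook, codeword lookup, graph partitioning) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: patches are tiny numeric tuples rather than image patches, the medoids are initialised at evenly spaced indices, the `evidence` table holds made-up values standing in for whatever the learned exemplars encode, and the partitioning step is replaced by thresholded connected components. The abstract does not specify any of these details, and the toy graph also ignores patch positions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_medoids(patches, k, n_iter=20):
    """Alternating K-medoids; returns the indices of the k exemplar patches."""
    n = len(patches)
    medoids = [i * n // k for i in range(k)]  # assumed init: evenly spaced indices
    for _ in range(n_iter):
        # Assignment step: each patch joins the cluster of its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(patches):
            nearest = min(medoids, key=lambda m: sq_dist(p, patches[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = []
        for m, members in clusters.items():
            members = members or [m]  # guard against an empty cluster
            best = min(
                members,
                key=lambda c: sum(sq_dist(patches[c], patches[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def assign_codewords(patches, codebook):
    """Index of the nearest codebook exemplar for every patch."""
    return [
        min(range(len(codebook)), key=lambda c: sq_dist(p, codebook[c]))
        for p in patches
    ]


def partition(codewords, evidence, threshold=0.5):
    """Stand-in partitioning: link patches whose codeword pair has
    evidence >= threshold, then label connected components (union-find)."""
    parent = list(range(len(codewords)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(codewords)):
        for j in range(i + 1, len(codewords)):
            if evidence[codewords[i]][codewords[j]] >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(codewords))]


# Toy data: two well-separated groups of 2-D "patches" (hypothetical values).
patches = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids = k_medoids(patches, k=2)
codebook = [patches[m] for m in medoids]
codewords = assign_codewords(patches, codebook)

# Made-up evidence: patches with the same codeword are likely on the same line.
evidence = [[0.9, 0.1], [0.1, 0.9]]
labels = partition(codewords, evidence)
```

In this toy setup the two groups of patches end up with different codewords and different component labels. A realistic pipeline would extract patches from document images, learn the evidence from annotated training data, and restrict graph edges to spatially neighbouring patches.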

