利用距离变换图学习手稿中的自由线检测

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI:10.1109/ICDAR.2019.00044

M. Kassis, Jihad El-Sana

{"title":"利用距离变换图学习手稿中的自由线检测","authors":"M. Kassis, Jihad El-Sana","doi":"10.1109/ICDAR.2019.00044","DOIUrl":null,"url":null,"abstract":"We present a fully automated learning free method, for line detection in manuscripts. We begin by separating components that span over multiple lines, then we remove noise, and small connected components such as diacritics. We apply a distance transform on the image to create the image skeleton. The skeleton is pruned, its vertexes and edges are detected, in order to generate the initial document graph. We calculate the vertex v-score using its t-score and l-score quantifying its distance from being an absolute link in a line. In a greedy manner we classify each edge in the graph either a link, a bridge or a conflict edge. We merge every two edges classified as link together, then we merge the conflict edges next. Finally we remove the bridge edges from the graph generating the final form of the graph. Each edge in the graph equals to one extracted line. We applied the method on the DIVA-hisDB dataset on both public and private sections. The public section participated in the recently conducted Layout Analysis for Challenging Medieval Manuscripts Competition, and we have achieved results surpassing the vast majority of these systems.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Learning Free Line Detection in Manuscripts using Distance Transform Graph\",\"authors\":\"M. Kassis, Jihad El-Sana\",\"doi\":\"10.1109/ICDAR.2019.00044\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a fully automated learning free method, for line detection in manuscripts. We begin by separating components that span over multiple lines, then we remove noise, and small connected components such as diacritics. We apply a distance transform on the image to create the image skeleton. The skeleton is pruned, its vertexes and edges are detected, in order to generate the initial document graph. We calculate the vertex v-score using its t-score and l-score quantifying its distance from being an absolute link in a line. In a greedy manner we classify each edge in the graph either a link, a bridge or a conflict edge. We merge every two edges classified as link together, then we merge the conflict edges next. Finally we remove the bridge edges from the graph generating the final form of the graph. Each edge in the graph equals to one extracted line. We applied the method on the DIVA-hisDB dataset on both public and private sections. The public section participated in the recently conducted Layout Analysis for Challenging Medieval Manuscripts Competition, and we have achieved results surpassing the vast majority of these systems.\",\"PeriodicalId\":325437,\"journal\":{\"name\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2019.00044\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Document Analysis and Recognition (ICDAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2019.00044","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

我们提出了一种完全自动化的免费学习方法，用于手稿中的线条检测。我们首先分离跨多条线的组件，然后去除噪声和小的连接组件，如变音符号。我们在图像上应用距离变换来创建图像骨架。对骨架进行剪枝，检测其顶点和边缘，从而生成初始文档图。我们使用顶点的t-score和l-score来计算顶点的v-score，它们量化了顶点与直线上绝对链接的距离。以贪婪的方式，我们将图中的每条边分类为链接、桥或冲突边。我们先合并每两条被分类为连接的边，然后再合并冲突的边。最后，我们从图中移除桥边，生成图的最终形式。图中的每条边等于一条提取的线。我们在公共和私有部分的DIVA-hisDB数据集上应用了该方法。公共部门参加了最近进行的中世纪手稿挑战布局分析比赛，我们取得了超过绝大多数这些系统的成绩。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Learning Free Line Detection in Manuscripts using Distance Transform Graph

We present a fully automated learning free method, for line detection in manuscripts. We begin by separating components that span over multiple lines, then we remove noise, and small connected components such as diacritics. We apply a distance transform on the image to create the image skeleton. The skeleton is pruned, its vertexes and edges are detected, in order to generate the initial document graph. We calculate the vertex v-score using its t-score and l-score quantifying its distance from being an absolute link in a line. In a greedy manner we classify each edge in the graph either a link, a bridge or a conflict edge. We merge every two edges classified as link together, then we merge the conflict edges next. Finally we remove the bridge edges from the graph generating the final form of the graph. Each edge in the graph equals to one extracted line. We applied the method on the DIVA-hisDB dataset on both public and private sections. The public section participated in the recently conducted Layout Analysis for Challenging Medieval Manuscripts Competition, and we have achieved results surpassing the vast majority of these systems.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 International Conference on Document Analysis and Recognition (ICDAR)

自引率

0.00%

发文量