Arabic hand-written text-line extraction

Abderrazak Zahour, B. Taconet, P. Mercy, Said Ramdane
{"title":"Arabic hand-written text-line extraction","authors":"Abderrazak Zahour, B. Taconet, P. Mercy, Said Ramdane","doi":"10.1109/ICDAR.2001.953799","DOIUrl":null,"url":null,"abstract":"This paper describes a text-line extraction based method. The typical segmentation for a printed binary document is based on the horizontal projection analysis and then the regrouping of the connected components. These techniques can't be used for handwritten unconstrained text because data frequently contain undulations and shifts in the baseline, baseline-skew variability and inter-line distance variability. So, we think that the border line for a handwritten unconstrained documents should be a collection of horizontal line segments. From this point of view, we use a partial contour following based method to detect the separating lines. In the current version of our algorithm, we proceed to text slant detection, text line number evaluation by using partial projection. Then we carry out a partial contour following of every line; first in the direction of the writing, then in the opposite direction. After the treatment, the adjacent lines are separated. In the experimental session, we describe the application of the algorithm used for the extraction of text line. Database images contains about one hundred handwritten Arabic texts written by different writers. Results about diacritical points affectation are also reported.","PeriodicalId":277816,"journal":{"name":"Proceedings of Sixth International Conference on Document Analysis and Recognition","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"127","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of Sixth International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2001.953799","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 127

Abstract

This paper describes a text-line extraction based method. The typical segmentation for a printed binary document is based on the horizontal projection analysis and then the regrouping of the connected components. These techniques can't be used for handwritten unconstrained text because data frequently contain undulations and shifts in the baseline, baseline-skew variability and inter-line distance variability. So, we think that the border line for a handwritten unconstrained documents should be a collection of horizontal line segments. From this point of view, we use a partial contour following based method to detect the separating lines. In the current version of our algorithm, we proceed to text slant detection, text line number evaluation by using partial projection. Then we carry out a partial contour following of every line; first in the direction of the writing, then in the opposite direction. After the treatment, the adjacent lines are separated. In the experimental session, we describe the application of the algorithm used for the extraction of text line. Database images contains about one hundred handwritten Arabic texts written by different writers. Results about diacritical points affectation are also reported.
阿拉伯语手写文本行提取
本文描述了一种基于文本行提取的方法。典型的打印二进制文档分割方法是基于水平投影分析,然后将连接的组件重新分组。这些技术不能用于手写的无约束文本,因为数据经常包含基线的波动和移动、基线倾斜可变性和线间距离可变性。因此,我们认为手写的无约束文档的边界线应该是水平线段的集合。从这个角度来看,我们使用基于部分轮廓跟踪的方法来检测分离线。在当前版本的算法中,我们继续使用部分投影进行文本倾斜检测,文本行数评估。然后对每条线进行局部轮廓跟踪;首先顺着书写的方向,然后顺着相反的方向。处理后,相邻的线被分开。在实验部分,我们描述了该算法在文本行提取中的应用。数据库图像包含大约100个由不同作者手写的阿拉伯文本。本文还报道了变音符点影响的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信