Skeleton extraction: Comparison of five methods on the Arabic IFN/ENIT database

Atallah Al-Shatnawi, K. Omar, Bader M. AlFawwaz, A. Zeki
Published in: 2014 6th International Conference on Computer Science and Information Technology (CSIT)
Publication date: 2014-03-26
DOI: 10.1109/CSIT.2014.6805978 (https://doi.org/10.1109/CSIT.2014.6805978)
Citations: 9

Abstract

Thinning, or skeletonization, is a crucial stage in Arabic Character Recognition (ACR) systems. It simplifies the text shape and reduces the amount of data to be handled, and it is commonly used as a pre-processing stage for recognition and storage systems. The skeleton of Arabic text can be used for baseline detection, character segmentation, and feature extraction, and it ultimately supports classification. In this paper, five state-of-the-art thinning algorithms are selected and implemented: SPTA, the Zhang-Suen parallel thinning algorithm, the Huang-Wan-Liu thinning algorithm, and the thinning and skeletonization algorithms based on morphological operations. The five algorithms are applied to the IFN/ENIT dataset, and their results are discussed and analyzed in terms of preserving the shape and connectivity of the text, preventing spurious tails, maintaining a one-pixel-wide skeleton, avoiding the necking problem, and running time. In addition, performance measures for checking text connectivity, detecting spurious tails, and calculating stroke thickness are proposed and carried out.
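To make one of the compared methods concrete, the following is a minimal pure-Python sketch of the textbook Zhang-Suen parallel thinning scheme, together with a simple 8-connected component count that could serve as a connectivity check. It is an illustration under stated assumptions, not the authors' implementation: the function names `zhang_suen_thin` and `count_components`, the list-of-lists image format (1 = ink, 0 = background), and the use of component counting as the connectivity measure are all hypothetical, since the abstract does not specify how the proposed measures are computed.

```python
def zhang_suen_thin(image):
    """Thin a binary image (list of rows of 0/1, 1 = ink); returns a new image."""
    rows, cols = len(image), len(image[0])
    # Pad with a one-pixel background border so every pixel has 8 neighbours.
    img = ([[0] * (cols + 2)]
           + [[0] + list(row) + [0] for row in image]
           + [[0] * (cols + 2)])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # the two sub-iterations of each pass
            to_delete = []
            for r in range(1, rows + 1):
                for c in range(1, cols + 1):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of ink neighbours
                    # number of 0 -> 1 transitions around the neighbourhood
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((r, c))
            # Parallel update: delete only after the whole image was scanned.
            for r, c in to_delete:
                img[r][c] = 0
            changed = changed or bool(to_delete)
    return [row[1:-1] for row in img[1:-1]]


def count_components(image):
    """Count 8-connected ink components (an assumed connectivity check)."""
    rows, cols = len(image), len(image[0])
    seen, count = set(), 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or (r, c) in seen:
                continue
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:  # iterative flood fill over the 8-neighbourhood
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return count


bar = [[1] * 7 for _ in range(3)]          # a 3-pixel-thick horizontal stroke
skeleton = zhang_suen_thin(bar)
same_connectivity = count_components(bar) == count_components(skeleton)
```

On the 3x7 stroke above, the sketch leaves a one-pixel-wide segment in the middle row and the component count stays at one. Note that the basic Zhang-Suen rules do not guarantee connectivity or shape preservation in general (a 2x2 block, for instance, is erased entirely), which is the kind of behaviour a comparison like the one in this paper is meant to expose.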