Simple and effective techniques for core-region detection and slant correction in offline script recognition

2009 IEEE International Conference on Signal and Image Processing Applications. Pub Date: 2009-12-01. DOI: 10.1109/ICSIPA.2009.5478628

A. Rehman, Dzulkifli Mohammad, G. Sulong, T. Saba


Citations: 39

Abstract

This paper presents two new preprocessing techniques for cursive script recognition: an enhanced algorithm for core-region detection and an effective algorithm for uniform slant angle estimation. The reference lines that bound the core region are usually taken as those surrounding the highest density peaks, but they are strongly affected by long horizontal strokes and erratic characters in the word. Such strokes are confused with the actual core region and lead to decisive errors in normalizing the word. To overcome this problem, quantiles are introduced into core-region detection to make the process robust. Existing approaches to slant removal in cursive script, meanwhile, are computationally heavy. A simple, formalized, and effective method is therefore presented for detecting and removing the slant angle of offline cursive handwritten words without heavy experimental effort. Words that are not slanted are not adversely affected by the algorithm. Core-region detection is based on statistical features, while slant angle estimation is based on structural features of the word image. The algorithms are tested on the IAM benchmark database of cursive handwritten words. Promising results for core-region detection and for slant angle estimation and removal are reported and compared with the widely applied Bozinovic and Srihari method (BSM).
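The abstract says only that quantiles are introduced to make core-region detection robust; it does not give the exact scheme. The following is a minimal sketch of one way such an idea could work, not the authors' algorithm. It assumes a binarized word image (1 = ink), takes the horizontal projection profile, and places the two reference lines at the rows where the cumulative ink reaches a lower and an upper quantile. The function names and the 0.15 / 0.85 quantile values are hypothetical choices made for illustration.

```python
def row_profile(image):
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in image]


def quantile_row(profile, q):
    """Index of the first row at which cumulative ink reaches fraction q."""
    total = sum(profile)
    if total == 0:
        raise ValueError("image contains no ink pixels")
    threshold = q * total
    running = 0
    for index, count in enumerate(profile):
        running += count
        if running >= threshold:
            return index
    return len(profile) - 1


def core_region(image, lower_q=0.15, upper_q=0.85):
    """Return (top_row, bottom_row) bounding the assumed core region.

    lower_q and upper_q are illustrative defaults, not values from the paper.
    """
    profile = row_profile(image)
    return quantile_row(profile, lower_q), quantile_row(profile, upper_q)


# Toy word image: an ascender in rows 0-1, the word body in rows 2-4,
# and a descender in rows 5-6. '#' marks ink.
ROWS = [
    "..#.......",
    "..#.......",
    ".###.##.##",
    ".#.#.#..#.",
    ".###.##.##",
    "......#...",
    "......#...",
]
image = [[1 if ch == "#" else 0 for ch in row] for row in ROWS]

print(core_region(image))  # (2, 4)
```

Because the reference lines depend on the cumulative distribution of ink rather than on the single highest peak, a thin ascender or descender shifts them very little in this toy case. Whether this matches the behaviour reported in the paper cannot be confirmed from the abstract alone.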
