Combination of statistic and structural approach to scripts segmentation from line segmentation of Javanese manuscript image

Q1 Arts and Humanities
Anastasia Rita Widiarti, Marsono, A. Harjoko, S. Hartati
DOI: 10.1109/DigitalHeritage.2013.6743844 (https://doi.org/10.1109/DigitalHeritage.2013.6743844)
Published: 2013-10-01, Studies in Digital Heritage, vol. 32(1), p. 775 (Journal Article)
Citations: 1

Abstract

Character segmentation of handwritten manuscripts is often a complicated task. Many factors make such segmentation difficult, including inconsistencies in the slope, slant, length, and width of each character, as well as intersections between two characters from the same line or from different lines. This paper proposes a new approach that combines statistical and structural analyses to segment Javanese script characters from the line-segmented images of a Javanese manuscript. For each new manuscript, all objects that make up its characters are identified with a connectivity operation: pixels that are interconnected form one object and receive the same label. Next, the average height and average width of the labeled objects are calculated, together with their standard deviations. These statistics define the normal size of a character: when an object has a height or width that exceeds the average plus the standard deviation, it is concluded that the object in fact consists of two characters that touch each other. To normalize a skewed cluster of characters, the script is straightened until it is perpendicular. The experiment used 13 line images from different authors with different writing styles, and the result shows a segmentation accuracy of 88.19%. It can be concluded that the proposed segmentation approach is relatively successful when applied to handwritten Javanese characters.
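The labeling and size-threshold steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which connectivity, which size measure, or which standard-deviation estimator was used, so the sketch assumes 8-connectivity, bounding-box width and height, the population standard deviation, and a binary line image stored as a list of rows with 1 for ink. The function names `label_components` and `flag_touching` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def label_components(img):
    """Return one bounding box (top, left, bottom, right) per 8-connected blob."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Flood fill: every pixel reached from (r, c) belongs to one object.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def flag_touching(boxes):
    """Flag objects whose width or height exceeds mean + standard deviation."""
    widths = [right - left + 1 for _, left, _, right in boxes]
    heights = [bottom - top + 1 for top, _, bottom, _ in boxes]
    # Assumption: population standard deviation; the abstract does not specify.
    w_limit = mean(widths) + pstdev(widths)
    h_limit = mean(heights) + pstdev(heights)
    return [w > w_limit or h > h_limit for w, h in zip(widths, heights)]


# Toy line image: three blobs of width 2 and one of width 5, all of height 3.
toy = [[int(ch) for ch in row] for row in ["11011011111011"] * 3]
print(flag_touching(label_components(toy)))  # [False, False, True, False]
```

In the toy image the mean width is 2.75 and the standard deviation is about 1.30, so only the blob of width 5 crosses the limit and is treated as two touching characters. How a flagged object is then split, and how skew is corrected, is not detailed in the abstract and is left out of the sketch.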
Source journal
Studies in Digital Heritage (Arts and Humanities: Classics)
CiteScore: 2.80
Self-citation rate: 0.00%
Articles published: 2
Review time: 12 weeks