A Scheme Towards Automatic Word Indexation System for Balinese Palm Leaf Manuscripts

Impact factor 0.5; JCR Q4 (Computer Science, Information Systems)
M. W. A. Kesiman, G. Pradnyana
Published 2021-10-07 in Journal of ICT Research and Applications. DOI: 10.5614/itbj.ict.res.appl.2021.15.2.1 (https://doi.org/10.5614/itbj.ict.res.appl.2021.15.2.1)
Citations: 0

Abstract

A Scheme Towards Automatic Word Indexation System for Balinese Palm Leaf Manuscripts
This paper proposes an initial scheme towards an automatic word indexation system for collections of Balinese lontars (palm leaf manuscripts). The scheme consists of a submodule that extracts patch images of text areas in lontars and a submodule that transliterates word images. This is the first word indexation system to be proposed for lontar collections. To detect the parts of a lontar image that contain text, a Gabor filter is used to provide initial information about the presence of text texture in the image. An adaptive sliding patch algorithm is also proposed for extracting patch images from lontars. The word image transliteration submodule was built using a long short-term memory (LSTM) model. The results showed that the patch extraction process succeeded in optimally detecting text areas in lontars and in extracting patch images at suitable positions. The proposed scheme extracted between 20% and 40% of the keywords in lontars. It can therefore at least give prospective readers an initial description of the content of a lontar collection, or help them find which lontar collection contains certain keywords.
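The abstract does not give the filter parameters or the details of the adaptive sliding patch algorithm, so the following is only a rough sketch of the general idea: a Gabor filter responds strongly to stroke-like texture, and a sliding window keeps the patches where that response is high. The kernel size, wavelength, threshold, fixed stride, the synthetic striped image, and the function names `gabor_kernel`, `gabor_energy` and `text_patches` are all assumptions made for illustration. The fixed-stride window is a simplified stand-in and not the paper's adaptive algorithm.

```python
import math

def gabor_kernel(size=7, sigma=2.0, theta=0.0, wavelength=4.0, gamma=0.5):
    """Real part of a Gabor kernel as a size x size list of lists (assumed parameters)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    # Subtract the mean so that flat, textureless regions give (near) zero response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def gabor_energy(image, kernel):
    """Absolute filter response per pixel; border pixels are left at zero."""
    h, w, k = len(image), len(image[0]), len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(half, h - half):
        for j in range(half, w - half):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += kernel[di][dj] * image[i + di - half][j + dj - half]
            out[i][j] = abs(acc)
    return out

def text_patches(energy, patch_h, patch_w, stride, threshold):
    """Slide a fixed-stride window; keep (top, left) where mean energy exceeds threshold."""
    h, w = len(energy), len(energy[0])
    kept = []
    for top in range(0, h - patch_h + 1, stride):
        for left in range(0, w - patch_w + 1, stride):
            total = sum(sum(energy[r][left:left + patch_w])
                        for r in range(top, top + patch_h))
            if total / (patch_h * patch_w) > threshold:
                kept.append((top, left))
    return kept

# Synthetic 16 x 32 grayscale image: the left half is blank leaf (constant),
# the right half has vertical stripes standing in for text strokes.
image = [[1.0 if x < 16 or x % 4 < 2 else 0.0 for x in range(32)]
         for _ in range(16)]

energy = gabor_energy(image, gabor_kernel())
patches = text_patches(energy, patch_h=8, patch_w=8, stride=8, threshold=1.0)
print(patches)  # top-left corners of patches flagged as containing texture
```

In a full pipeline, each retained patch would then be passed to the transliteration submodule. On real lontar images the orientation, wavelength and threshold would need tuning, and an established image-processing library would be a more practical choice than this pure-Python loop.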
Source journal
Journal of ICT Research and Applications (Computer Science, Information Systems)
CiteScore: 1.60
Self-citation rate: 0.00%
Articles published: 13
Review time: 24 weeks
About the journal: Journal of ICT Research and Applications welcomes full research articles in the area of Information and Communication Technology from the following subject areas: Information Theory, Signal Processing, Electronics, Computer Network, Telecommunication, Wireless & Mobile Computing, Internet Technology, Multimedia, Software Engineering, Computer Science, Information System and Knowledge Management. Authors are invited to submit articles that have not been published previously and are not under consideration elsewhere.