Recognising Devanagari Script by Deep Structure Learning of Image Quadrants

Impact factor: 0.6 | JCR: Q3 | Category: Information Science & Library Science
Seba Susan, J. Malhotra
DOI: https://doi.org/10.14429/djlit.40.05.16336
Journal: DESIDOC Journal of Library & Information Technology
Published: 2020-11-04
Citations: 9

Abstract

Ancient Indic languages were written in the Devanagari script, from which most modern Indic writing systems have evolved. The digitisation of ancient Devanagari manuscripts, now archived in national museums, is part of the language documentation and digital archiving initiative of the Government of India. The challenge in digitising these handwritten scripts is the lack of adequate datasets for training machine learning models. In our work, we focus on the Devanagari script, whose 46 character categories make training difficult, especially when samples are few. We propose deep structure learning of image quadrants, based on learning the hidden state activations derived from convolutional neural networks that are trained separately on five image quadrants. The second phase of our learning module comprises a deep neural network that learns the hidden state activations of the five convolutional neural networks, fused by concatenation. The experiments show that the proposed deep structure learning outperforms the state of the art.
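
The data flow described in the abstract (one feature extractor per image region, with the outputs fused by concatenation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the five image quadrants, so the sketch assumes four quadrants plus a centre crop, and the hypothetical `row_means` function stands in for the hidden activations of a trained convolutional network. All names below are invented for this sketch.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]                 # grayscale image as rows of pixels
Extractor = Callable[[Image], List[float]]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the height x width sub-image whose corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def five_regions(img: Image) -> List[Image]:
    """Split an image into four quadrants plus a centre crop.

    ASSUMPTION: using a centre crop as the fifth region is a guess made
    for illustration; the abstract does not say how the five regions
    are chosen. Even image dimensions are assumed for full coverage.
    """
    half_h, half_w = len(img) // 2, len(img[0]) // 2
    return [
        crop(img, 0, 0, half_h, half_w),                      # top-left
        crop(img, 0, half_w, half_h, half_w),                 # top-right
        crop(img, half_h, 0, half_h, half_w),                 # bottom-left
        crop(img, half_h, half_w, half_h, half_w),            # bottom-right
        crop(img, half_h // 2, half_w // 2, half_h, half_w),  # centre (assumed)
    ]


def row_means(region: Image) -> List[float]:
    """Hypothetical stand-in for a trained CNN's hidden activations."""
    return [sum(row) / len(row) for row in region]


def fuse_by_concatenation(img: Image, extractors: Sequence[Extractor]) -> List[float]:
    """Apply one extractor per region and concatenate the feature vectors."""
    regions = five_regions(img)
    if len(extractors) != len(regions):
        raise ValueError("expected one extractor per region")
    fused: List[float] = []
    for region, extract in zip(regions, extractors):
        fused.extend(extract(region))
    return fused


# Toy 4x4 image with pixel values 0..15, one extractor per region.
toy_image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
fused_vector = fuse_by_concatenation(toy_image, [row_means] * 5)
print(len(fused_vector))  # 5 regions x 2 features each = 10
```

In the setting the abstract describes, each placeholder extractor would be a separately trained convolutional network, and the fused vector would be the input to the second-phase deep neural network that predicts one of the 46 character categories.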
Source journal
DESIDOC Journal of Library & Information Technology (Information Science & Library Science)
CiteScore: 1.70
Self-citation rate: 11.10%
Articles published: 32
About the journal: DESIDOC Journal of Library & Information Technology publishes original research and review papers related to library science and IT applied to library activities, services, and products. Major subject fields covered include information systems, knowledge management, collection building and management, information behaviour and retrieval, librarianship and library management, library and information services, and records management and preservation.