Recognising Devanagari Script by Deep Structure Learning of Image Quadrants
Authors: Seba Susan, J. Malhotra
Journal: DESIDOC Journal of Library & Information Technology (impact factor 0.6; JCR Q3, Information Science & Library Science)
Publication date: 2020-11-04
Publication type: Journal Article
DOI: https://doi.org/10.14429/djlit.40.05.16336
Citations: 9
Abstract
Ancient Indic languages were written in the Devanagari script, from which most modern-day Indic writing systems have evolved. The digitisation of ancient Devanagari manuscripts, now archived in national museums, is part of the language documentation and digital archiving initiative of the Government of India. The challenge in digitising these handwritten scripts is the lack of adequate datasets for training machine learning models. In our work, we focus on the Devanagari script, whose 46 character categories make training difficult, especially when samples are few. We propose deep structure learning of image quadrants, based on learning the hidden state activations derived from convolutional neural networks that are trained separately on five image quadrants. The second phase of our learning module comprises a deep neural network that learns the hidden state activations of the five convolutional neural networks, fused by concatenation. The experiments show that the proposed deep structure learning outperforms the state of the art.
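The two-phase pipeline described in the abstract can be illustrated with a shape-level sketch in NumPy. This is not the authors' implementation. The 32x32 input size, the choice of a centre crop as the fifth region, the single-layer feature extractor, the layer widths, and all function names (`split_regions`, `conv_features`, `fused_activations`, `classify`) are assumptions made only for illustration, and the weights are random rather than trained. Only the count of five region networks, the fusion by concatenation, and the 46 output classes come from the abstract.

```python
import numpy as np

NUM_CLASSES = 46  # character categories, as stated in the abstract


def split_regions(img):
    """Return five sub-images: four corner quadrants plus a centre crop.

    ASSUMPTION: the abstract does not define the five regions; the centre
    crop is a hypothetical choice made for this sketch.
    """
    h, w = img.shape
    hh, hw = h // 2, w // 2
    return [
        img[:hh, :hw],                                      # top-left
        img[:hh, hw:],                                      # top-right
        img[hh:, :hw],                                      # bottom-left
        img[hh:, hw:],                                      # bottom-right
        img[hh // 2: hh // 2 + hh, hw // 2: hw // 2 + hw],  # centre (assumed)
    ]


def conv_features(region, kernels):
    """Stand-in for one region CNN: a valid 3x3 cross-correlation per
    kernel, ReLU, then global average pooling to one value per kernel."""
    kh, kw = kernels.shape[1:]
    out_h = region.shape[0] - kh + 1
    out_w = region.shape[1] - kw + 1
    feats = np.empty(len(kernels))
    for k, kernel in enumerate(kernels):
        fmap = np.empty((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                fmap[i, j] = np.sum(region[i:i + kh, j:j + kw] * kernel)
        feats[k] = np.maximum(fmap, 0.0).mean()
    return feats


def fused_activations(img, kernel_bank):
    """Phase 1: one feature vector per region, fused by concatenation."""
    regions = split_regions(img)
    return np.concatenate(
        [conv_features(r, k) for r, k in zip(regions, kernel_bank)]
    )


def classify(fused, w1, b1, w2, b2):
    """Phase 2: a small dense network over the fused activations."""
    hidden = np.maximum(fused @ w1 + b1, 0.0)
    logits = hidden @ w2 + b2
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


rng = np.random.default_rng(0)
img = rng.random((32, 32))                       # assumed input size
kernel_bank = rng.standard_normal((5, 8, 3, 3))  # 5 regions, 8 kernels each
fused = fused_activations(img, kernel_bank)      # 5 * 8 = 40 values
w1 = rng.standard_normal((40, 64)) * 0.1         # untrained, random weights
b1 = np.zeros(64)
w2 = rng.standard_normal((64, NUM_CLASSES)) * 0.1
b2 = np.zeros(NUM_CLASSES)
probs = classify(fused, w1, b1, w2, b2)          # one probability per class
```

In a trained system each region network would be fitted separately first, and the second-phase network would then be trained on the concatenated activations; the sketch only shows how the shapes flow from five regions to a single 46-way prediction.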
About the journal:
DESIDOC Journal of Library & Information Technology publishes original research and review papers related to library science and IT applied to library activities, services, and products. Major subject fields covered include: Information systems, Knowledge management, Collection building & management, Information behaviour & retrieval, Librarianship/library management, Library & information services, Records management & preservation, etc.