Pattern Matching Model for Recognition of Stone Inscription Characters

IF 1.5 · Zone 4 (Computer Science) · Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Journal, vol. 66, no. 3, pp. 554-564 · Pub Date: 2021-10-01 · DOI: 10.1093/comjnl/bxab177

K Durga Devi; P Uma Maheswari; Phani Kumar Polasi; R Preetha; M Vidhyalakshmi

Citations: 5

Abstract

Although handwritten character recognition has been studied extensively, very little work has been reported on inscription characters, especially Tamil stone inscriptions. The main challenges in handling stone inscriptions are dataset collection and discriminating foreground from background. To date, the archaeological department follows the traditional way of capturing, preserving and deciphering stone inscriptions, which is manual, time-consuming and dependent on expert assistance. Digitized recognition is therefore essential, and an efficient pattern matching algorithm is needed to deal with variations in the shape and size of the complex structured characters found in Tamil stone inscriptions. This paper develops an automated character recognition approach based on pattern matching, in which character features are extracted by a pattern matching algorithm that helps achieve a good recognition rate. An Image-based Character Pattern Identification (ICPI) system recognizes ancient Tamil stone inscription characters and finds their corresponding contemporary Tamil characters. A Modified Speeded Up Robust Feature with Bag of Grapheme (MSURF-BoG) algorithm is implemented to detect the strongest key points from input characters in different orientations. These key point features are used to train a model called Bag of Grapheme (BoG) through code word creation. Key point features are thus extracted in an unsupervised manner and pattern matching is performed. Samples were taken from 11th-century Tamil stone inscriptions, covering 7 vowels and 17 consonants, 24 characters in total. Samples of each of the 24 characters in different orientations were used to train the system. The proposed system is evaluated by recognition accuracy, which is reported character-wise, with a maximum of 96%.
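The abstract does not spell out the MSURF-BoG steps, so the following is only a rough sketch of the general bag-of-features idea it appears to build on: cluster key point descriptors into code words, encode each character image as a histogram over those code words, and match a query to the closest stored histogram. Everything here is an assumption for illustration: the descriptors are synthetic 2-D points standing in for real SURF descriptors, the clustering is plain k-means with farthest-point seeding, and the names `build_codebook`, `encode`, `match`, and `toy_descriptors` are hypothetical, not taken from the paper.

```python
import math
import random


def nearest(vec, centers):
    """Index of the center closest to vec (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(vec, centers[i]))


def build_codebook(descriptors, k, iters=10):
    """Cluster descriptors into k code words with plain k-means.

    Centers are seeded by farthest-point selection, so the result is
    deterministic for a given input order.
    """
    centers = [list(descriptors[0])]
    while len(centers) < k:
        far = max(descriptors, key=lambda d: min(math.dist(d, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the previous center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode(descriptors, codebook):
    """Normalized histogram of code-word counts for one character image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def match(query_hist, models):
    """Label of the stored histogram closest to the query."""
    return min(models, key=lambda label: math.dist(query_hist, models[label]))


def toy_descriptors(blobs, per_blob, rng):
    """Synthetic 2-D 'descriptors' scattered around the given blob centers."""
    return [[x + rng.uniform(-0.5, 0.5), y + rng.uniform(-0.5, 0.5)]
            for (x, y) in blobs for _ in range(per_blob)]


rng = random.Random(0)
# Two hypothetical character classes, each with its own key point pattern.
train = {
    "a": toy_descriptors([(0, 0), (10, 10)], 20, rng),
    "ka": toy_descriptors([(0, 10), (10, 0)], 20, rng),
}
codebook = build_codebook(train["a"] + train["ka"], k=4)
models = {label: encode(descs, codebook) for label, descs in train.items()}

query = toy_descriptors([(0, 0), (10, 10)], 7, rng)  # unseen sample of "a"
predicted = match(encode(query, codebook), models)
print(predicted)
```

In a real pipeline the synthetic descriptors would be replaced by descriptors computed at detected key points of each inscription image, and the code book size would be tuned on the training set; neither detail is given in the abstract.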


Source journal

Computer Journal (Engineering & Technology - Computer Science: Software Engineering)

CiteScore: 3.60

Self-citation rate: 7.10%

Articles published: 164

Review time: 4.8 months

Journal description: The Computer Journal is one of the longest-established journals serving all branches of the academic computer science community. It is currently published in four sections.