A syntactic PR approach to Telugu handwritten character recognition

DAR '12, Pub Date: 2012-12-16, DOI: 10.1145/2432553.2432579

Samita Pradhan, A. Negi

Citations: 9

Abstract

This paper presents a character recognition mechanism based on a syntactic pattern recognition (PR) approach that uses the trie data structure for efficient recognition and approximate string matching for classification. During preprocessing, an input character image is transformed into a skeletonized image, and discrete curves are found using a 3 x 3 pixel region. A trie, which we call a sequence trie, serves as a lower-level lookup that encodes each discrete curve pattern of pixels: the sequence of discrete curves from the input pattern is looked up in the sequence trie, and the resulting sequence numbers for the thinned character together form a pattern string. Approximate string matching then compares the encoded pattern string of each template character with the pattern string obtained from the input character. We use approximate rather than exact matching to make the approach robust in the presence of noise. A second trie (called the pattern trie) stores the template strings so that they can be retrieved efficiently during approximate matching. We use a trie because a lookup takes O(m) time in the worst case, where m is the length of the longest string in the trie. For the approximate string matching, we use look-ahead with a branch and bound scheme in the trie. For demonstration, we apply our method to 43 characters from the basic Telugu character set. The proposed approach recognised all of the test characters given here correctly; however, more extensive testing on realistic data is required.
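
The pattern-trie lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure or the exact look-ahead rule, so the sketch assumes Levenshtein (edit) distance with unit costs and uses the smallest entry of the current dynamic-programming row as the bound for pruning a branch. The names `PatternTrie`, `TrieNode`, `nearest`, and `max_cost`, as well as the integer sequence numbers and the labels "A", "B", "C", are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    label: Optional[str] = None  # set on the node that ends a template string


class PatternTrie:
    """Stores template pattern strings (sequences of sequence numbers)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, pattern, label):
        # Walk or create one node per symbol: O(m) for a pattern of length m.
        node = self.root
        for symbol in pattern:
            node = node.children.setdefault(symbol, TrieNode())
        node.label = label

    def nearest(self, query, max_cost):
        """Return (label, cost) of the closest template within max_cost, else None."""
        best = [None, max_cost + 1]  # best label so far and the current bound
        first_row = list(range(len(query) + 1))
        for symbol, child in self.root.children.items():
            self._search(child, symbol, query, first_row, best)
        return (best[0], best[1]) if best[0] is not None else None

    def _search(self, node, symbol, query, prev_row, best):
        # Extend the edit-distance table by one row for this trie symbol.
        row = [prev_row[0] + 1]
        for j in range(1, len(query) + 1):
            substitute = prev_row[j - 1] + (query[j - 1] != symbol)
            row.append(min(row[j - 1] + 1, prev_row[j] + 1, substitute))
        if node.label is not None and row[-1] < best[1]:
            best[0], best[1] = node.label, row[-1]
        # Bound (assumed rule): no extension of this prefix can cost less
        # than min(row), so skip the subtree if it cannot beat the best match.
        if min(row) < best[1]:
            for next_symbol, child in node.children.items():
                self._search(child, next_symbol, query, row, best)


# Hypothetical templates: the numbers stand in for sequence-trie codes.
trie = PatternTrie()
trie.insert((3, 1, 4, 1, 5), "A")
trie.insert((3, 1, 4, 2, 6), "B")
trie.insert((2, 7, 1, 8), "C")
print(trie.nearest((3, 1, 9, 1, 5), max_cost=2))  # ('A', 1)
```

Because templates that share a prefix share the corresponding table rows, and because a subtree is abandoned as soon as its bound reaches the best cost found so far, a noisy input string is compared against the full template set without computing a separate distance for every template.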
