DAR '12 latest papers - Book学术

Bangla date field extraction in offline handwritten documents

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432561

Ranju Mandal, P. Roy, U. Pal

Citations: 1

A data acquisition and analysis system for palm leaf documents in Telugu

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432578

P. N. Sastry, R. Krishnan

Citations: 5

Benchmarking recognition results on camera captured word image data sets

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432572

D. Kumar, M. Prasad, A. Ramakrishnan

Abstract: We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera-captured scene images, born-digital images (BDI) and street view images. Using a Matlab-based tool that we developed, we have annotated more than 3600 word images from the five data sets at the pixel level. The word images binarized by the tool, as well as those binarized by our own midline analysis and propagation of segmentation (MAPS) algorithm, are recognized using the trial version of Nuance Omnipage OCR, and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS-binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011, respectively, and these values are higher than the best values reported in the literature, 61.1% and 41.2%, respectively. The MAPS result of 82.8% on the BDI 2011 data set matches the performance of the state-of-the-art method based on the power-law transform.

Citations: 14

Assamese online handwritten digit recognition system using hidden Markov models

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432573

G. S. Reddy, Bandita Sarma, R. Naik, S. Prasanna, C. Mahanta

Citations: 24

Offline handwritten word recognition in Hindi

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432563

R. Sitaram, Shrang Jain, Hariharan Ravishankar

Citations: 14

Development of an Assamese OCR using Bangla OCR

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432566

Subhankar Ghosh, P. Bora, Sanjib Das, B. Chaudhuri

Citations: 10

Line segmentation of handwritten Gurmukhi manuscripts

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432568

S. Jindal, Gurpreet Singh Lehal

Citations: 13

A syntactic PR approach to Telugu handwritten character recognition

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432579

Samita Pradhan, A. Negi

Abstract: This paper presents a character recognition mechanism based on a syntactic PR approach that uses the trie data structure for efficient recognition. It uses approximate string matching for classification. During preprocessing, an input character image is transformed into a skeletonized image, and discrete curves are found using a 3 x 3 pixel region. A trie, which we call a sequence trie, is used as a lookup structure at a lower level to encode a discrete curve pattern of pixels. The sequence of such discrete curves from the input pattern is looked up in the sequence trie. Encoding several such sequence numbers for the thinned character constructs a pattern string. Approximate string matching is used to compare the encoded pattern string from a template character with the pattern string obtained from the input character. We use approximate rather than exact string matching to make the approach robust in the presence of noise. Another trie data structure (called the pattern trie) is used for efficient storage and retrieval during approximate string matching. We use the trie because a lookup takes O(m) time in the worst case, where m is the length of the longest string in the trie. For the approximate string matching we use look-ahead with a branch and bound scheme in the trie. Here we apply our method to 43 of the basic Telugu characters for demonstration. The proposed approach recognised all the test characters given here correctly; however, more extensive testing on realistic data is required.
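The pattern-trie lookup with approximate matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the digit-string encoding of curves, and the character labels are all hypothetical, and the search uses a standard Levenshtein-distance walk over the trie, pruning any branch whose best possible distance already exceeds the bound, as a stand-in for the paper's look-ahead with branch and bound scheme.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set only on the node that ends a stored pattern string.
        self.label: Optional[str] = None


class PatternTrie:
    """Hypothetical pattern trie: stores template pattern strings by label."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, pattern: str, label: str) -> None:
        node = self.root
        for symbol in pattern:
            node = node.children.setdefault(symbol, TrieNode())
        node.label = label

    def search(self, query: str, max_dist: int) -> List[Tuple[str, int]]:
        """Return (label, distance) for stored patterns within max_dist edits."""
        results: List[Tuple[str, int]] = []
        # Edit-distance row for the empty prefix of any stored pattern.
        first_row = list(range(len(query) + 1))
        for symbol, child in self.root.children.items():
            self._walk(child, symbol, query, first_row, max_dist, results)
        return sorted(results, key=lambda r: (r[1], r[0]))

    def _walk(self, node: TrieNode, symbol: str, query: str,
              prev_row: List[int], max_dist: int,
              results: List[Tuple[str, int]]) -> None:
        # Extend the edit-distance table by one row for this trie symbol.
        row = [prev_row[0] + 1]
        for j in range(1, len(query) + 1):
            cost = 0 if query[j - 1] == symbol else 1
            row.append(min(row[j - 1] + 1,          # insertion
                           prev_row[j] + 1,         # deletion
                           prev_row[j - 1] + cost)) # match or substitution
        if node.label is not None and row[-1] <= max_dist:
            results.append((node.label, row[-1]))
        # Bound: descend only if some cell can still stay within max_dist.
        if min(row) <= max_dist:
            for next_symbol, child in node.children.items():
                self._walk(child, next_symbol, query, row, max_dist, results)


# Hypothetical templates: each digit stands for one encoded discrete curve.
trie = PatternTrie()
trie.insert("1232", "char_A")
trie.insert("1242", "char_B")
trie.insert("4411", "char_C")
print(trie.search("1233", max_dist=1))
```

Raising `max_dist` tolerates noisier input at the cost of visiting more of the trie; with `max_dist=0` the search reduces to exact lookup.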

Citations: 9

An empirical intrinsic mode based characterization of Indian scripts

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432575

Kavita Bhardwaj, S. Chaudhury, Sumantra Dutta Roy

Citations: 1

Recognition of Kannada characters extracted from scene images

DAR '12 Pub Date : 2012-12-16 DOI: 10.1145/2432553.2432557

D. Kumar, A. Ramakrishnan

Citations: 10