Anatomy of a hand-filled form reader

Proceedings of 1994 IEEE Workshop on Applications of Computer Vision. Pub Date: 1994-12-05. DOI: 10.1109/ACV.1994.341309

A. K. Chhabra

Citations: 2

Abstract

We describe a prototype generic form reader (GFR) system for reading hand-filled forms. The system can read run-on or touching handprinted characters. A one-time form specification is required for each type of form that the system is expected to read. The form specification includes geometric location of registration marks and fields of interest, field grammars, and system parameters. The GFR begins by detecting registration marks, computing image skew, extracting deskewed fields, and computing connected components in the field images. Next, the connected components are split into segments using heuristics about good splitting points. The system is liberal in splitting, i.e., a split segment could be a part of a character or a complete character, and hopefully no more than a character. Next, the segments are adaptively regrouped into 'seg-groups' with the aid of a dynamic programming algorithm that matches the character answers for the seg-groups with the field grammar specification. The single character recognizer (SCR) uses high order combinations of raw geometric features derived from segments and seg-groups. The high order combining rules are derived by statistical discriminant analysis of raw features. The GFR system provides some generic tools that can be applied to other document image analysis problems besides forms reading.
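The abstract does not spell out the regrouping step, so the following is only a minimal sketch of how a dynamic program could regroup over-split segments into seg-groups under a field grammar. It makes several assumptions that are not from the paper: the grammar is reduced to one set of allowed characters per field position, each seg-group is a run of at most `max_group` consecutive segments, and the recognizer returns additive scores (for example log-likelihoods, higher is better). The names `regroup`, `toy_recognize`, and `toy_scores` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical recognizer interface (an assumption, not the paper's SCR):
# given a run of segments [start, end), return a score per candidate character.
Recognizer = Callable[[int, int], Dict[str, float]]


def regroup(n_segments: int,
            grammar: List[str],
            recognize: Recognizer,
            max_group: int = 3) -> Optional[Tuple[str, List[Tuple[int, int]]]]:
    """Partition segments 0..n_segments-1 into len(grammar) consecutive groups.

    grammar[k] holds the characters allowed at field position k, a deliberately
    simple stand-in for a field grammar. Returns the best-scoring string and
    its group boundaries, or None if no partition satisfies the grammar.
    """
    neg_inf = float("-inf")
    n_chars = len(grammar)
    # best[k][j]: best total score reading k characters from the first j segments.
    best = [[neg_inf] * (n_segments + 1) for _ in range(n_chars + 1)]
    back = [[None] * (n_segments + 1) for _ in range(n_chars + 1)]
    best[0][0] = 0.0

    for k in range(1, n_chars + 1):
        for j in range(1, n_segments + 1):
            # Try every seg-group [i, j) that could form the k-th character.
            for i in range(max(0, j - max_group), j):
                if best[k - 1][i] == neg_inf:
                    continue
                allowed = {c: s for c, s in recognize(i, j).items()
                           if c in grammar[k - 1]}
                if not allowed:
                    continue
                char = max(allowed, key=allowed.get)
                total = best[k - 1][i] + allowed[char]
                if total > best[k][j]:
                    best[k][j] = total
                    back[k][j] = (i, char)

    if best[n_chars][n_segments] == neg_inf:
        return None

    # Walk the back-pointers from the last character to the first.
    chars, groups = [], []
    j = n_segments
    for k in range(n_chars, 0, -1):
        i, char = back[k][j]
        chars.append(char)
        groups.append((i, j))
        j = i
    return "".join(reversed(chars)), list(reversed(groups))


# Toy data: four segments, where segments 1 and 2 are two halves of a "4".
toy_scores = {
    (0, 1): {"1": -0.1},
    (1, 2): {"1": -1.5, "l": -0.5},
    (2, 3): {"1": -1.5},
    (1, 3): {"4": -0.2},
    (0, 2): {"7": -2.0},
    (3, 4): {"A": -0.3, "4": -0.1},
}


def toy_recognize(start: int, end: int) -> Dict[str, float]:
    return toy_scores.get((start, end), {})


digits = "0123456789"
result = regroup(4, [digits, digits, "ABC"], toy_recognize)
print(result)
```

In this toy run the grammar (digit, digit, letter from A to C) forces the last segment to be read as "A" even though the recognizer alone prefers "4", and the two middle segments are merged into one seg-group. This illustrates the idea of letting the grammar arbitrate between competing groupings; the actual GFR algorithm may differ in its grammar formalism and scoring.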
