Title: Anatomy of a hand-filled form reader
Author: A. K. Chhabra
Published in: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision
Publication date: 1994-12-05
DOI: 10.1109/ACV.1994.341309 (https://doi.org/10.1109/ACV.1994.341309)
Citation count: 2
Abstract
We describe a prototype generic form reader (GFR) system for reading hand-filled forms. The system can read run-on or touching handprinted characters. A one-time form specification is required for each type of form that the system is expected to read. The form specification includes the geometric locations of registration marks and fields of interest, field grammars, and system parameters. The GFR begins by detecting registration marks, computing image skew, extracting deskewed fields, and computing connected components in the field images. Next, the connected components are split into segments using heuristics about good splitting points. The system is liberal in splitting: a split segment may be part of a character or a complete character, but ideally no more than one character. The segments are then adaptively regrouped into 'seg-groups' with the aid of a dynamic programming algorithm that matches the character answers for the seg-groups against the field grammar specification. The single character recognizer (SCR) uses high-order combinations of raw geometric features derived from segments and seg-groups. The high-order combining rules are derived by statistical discriminant analysis of the raw features. The GFR system provides some generic tools that can be applied to document image analysis problems other than forms reading.
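The regrouping step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the grammar formalism, the cost function, or the recognizer interface, so all of these are assumptions made for illustration. Here the field grammar is simplified to a fixed-length list of allowed character sets, the recognizer is a hypothetical callable returning `(character, cost)` candidates for a group of consecutive segments, and `regroup`, `toy_recognize`, `TOY_SHAPES`, and `max_group` are invented names.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical recognizer interface: given a group of consecutive segments,
# return (character, cost) candidates; a lower cost means a better match.
Recognizer = Callable[[Sequence[str]], List[Tuple[str, float]]]


def regroup(
    segments: Sequence[str],
    grammar: Sequence[str],
    recognize: Recognizer,
    max_group: int = 3,
) -> Optional[Tuple[float, str, List[Tuple[str, ...]]]]:
    """Find the cheapest way to read all segments as len(grammar) characters.

    grammar[k] is the set of characters allowed at position k (a simplified
    stand-in for a field grammar). Returns (cost, text, seg_groups) or None
    if no grouping satisfies the grammar.
    """
    n, m = len(segments), len(grammar)
    inf = float("inf")
    # best[k][i]: lowest cost of reading the first i segments as k characters.
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0

    for k in range(1, m + 1):
        allowed = grammar[k - 1]
        for i in range(1, n + 1):
            # Try every seg-group that ends at segment i.
            for size in range(1, min(max_group, i) + 1):
                j = i - size
                if best[k - 1][j] == inf:
                    continue
                for char, cost in recognize(segments[j:i]):
                    if char not in allowed:
                        continue  # answer rejected by the grammar
                    total = best[k - 1][j] + cost
                    if total < best[k][i]:
                        best[k][i] = total
                        back[k][i] = (j, char)

    if best[m][n] == inf:
        return None

    # Walk the back-pointers to recover the characters and their seg-groups.
    chars, groups, i = [], [], n
    for k in range(m, 0, -1):
        j, char = back[k][i]
        chars.append(char)
        groups.append(tuple(segments[j:i]))
        i = j
    return best[m][n], "".join(reversed(chars)), list(reversed(groups))


# Toy recognizer: segments are symbolic stroke labels, not image data.
TOY_SHAPES = {
    ("7",): [("7", 0.1)],
    ("L",): [("1", 0.8)],
    ("|",): [("1", 0.2)],
    ("L", "|"): [("4", 0.1)],
}


def toy_recognize(group: Sequence[str]) -> List[Tuple[str, float]]:
    return TOY_SHAPES.get(tuple(group), [])


DIGITS = "0123456789"
result = regroup(["L", "|", "7"], [DIGITS, DIGITS], toy_recognize)
print(result[1])  # prints 47
```

In this toy run the over-split strokes `L` and `|` are merged into one seg-group read as `4`, because the two-digit grammar rules out the three-character reading. This mirrors the idea in the abstract that liberal splitting is safe as long as the regrouping stage can reassemble the pieces under the grammar's constraints.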