Reading handwritten US census forms

S. Madhvanath, V. Govindaraju, V. Ramanaprasad, Dar-Shyang Lee, S. Srihari
Published in: Proceedings of 3rd International Conference on Document Analysis and Recognition, 1995-08-14
DOI: 10.1109/ICDAR.1995.598949 (https://doi.org/10.1109/ICDAR.1995.598949)
Citations: 23

Abstract

Commercial forms-reading systems for extracting data from forms do not meet acceptable accuracy requirements on forms filled out by hand. In December 1993, NIST called on industry and research organizations working in handwriting recognition to participate in a test to determine the state of the art in the area. A database of form images containing actual responses received by the US Census Bureau was provided. The handwritten responses are very loosely constrained in terms of writing style, format of response and choice of text. The lexicons provided are very large (about 50,000 entries), and yet their coverage is incomplete (about 70%). In this paper we discuss the approach taken by CEDAR to automate the task of reading the census forms. The subtasks of field extraction and phrase recognition are described.
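To make the lexicon figures concrete, the sketch below computes coverage under one plausible reading of the term: the fraction of true responses that appear in the lexicon. This definition, the function name lexicon_coverage, the normalization (trimming and lowercasing), and the toy data are all assumptions made for illustration; none of them come from the paper or the NIST database.

```python
def lexicon_coverage(responses, lexicon):
    """Return the fraction of responses found in the lexicon.

    Matching is done on trimmed, lowercased text. This normalization is an
    assumption for the sketch, not a rule taken from the paper.
    """
    entries = {entry.strip().lower() for entry in lexicon}
    if not responses:
        return 0.0
    hits = sum(1 for response in responses if response.strip().lower() in entries)
    return hits / len(responses)


# Toy data invented for illustration only.
lexicon = ["teacher", "truck driver", "registered nurse"]
responses = [
    "Teacher",
    "truck driver",
    "self employed welder",  # not in the lexicon
    "registered nurse",
    "retired",               # not in the lexicon
]

print(lexicon_coverage(responses, lexicon))  # 0.6
```

Under this reading, a coverage of about 70% would mean that roughly three in ten responses have no exact lexicon entry, so a recognizer that can only return lexicon entries could not read them correctly no matter how good its character models are.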