Input Fields Recognition in Documents Using Deep Learning Techniques

Atharv Nagarikar, Rahul Singh Dangi, Samrit Kumar Maity, Ashish Kuvelkar, Sanjay Wandhekar
Journal: Revista Gestão Inovação e Tecnologias
Publication type: Journal Article
Published: 2021-08-12
DOI: 10.47059/revistageintec.v11i4.2468 (https://doi.org/10.47059/revistageintec.v11i4.2468)
Citations: 2

Abstract

Identifying the input fields that appear on a document is a crucial step in digitizing it. This paper presents a deep-learning-based approach to detecting input fields in a form or document that consists of text, images, and input fields such as text boxes and checkboxes. Forms were crawled and labelled manually with bounding boxes to build a dataset for training deep learning models. A YOLO V3 model is trained on the labelled dataset, which has four classes (static text, static image, input text, checkbox) and 1500 instances. The paper covers a limited set of input field types that commonly appear on printed forms; given a labelled dataset for other input field types, the same YOLO V3 model can be trained for them as well. We also discuss how such detection models can scale and sustain higher loads. The model is trained for 3500 iterations and achieves an accuracy of 71 percent.
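The abstract states that the dataset was labelled with bounding boxes over four classes, but it does not give the annotation format. As a minimal sketch, assuming the annotations start as pixel-space corner coordinates and are converted to the normalized `class x_center y_center width height` text lines that Darknet-style YOLO training commonly expects, the conversion could look like the following. The class order, the `Box` type, and the `to_yolo_line` helper are hypothetical names chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass

# Assumed class order: the paper names these four classes but not their indices.
CLASSES = ["static_text", "static_image", "input_text", "checkbox"]


@dataclass
class Box:
    """A labelled region in pixel coordinates (hypothetical annotation type)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box: Box, img_w: int, img_h: int) -> str:
    """Convert a corner-format box into one normalized YOLO label line."""
    class_id = CLASSES.index(box.label)
    # Box center and size, each divided by the image dimension so values fall in [0, 1].
    x_center = (box.x_min + box.x_max) / 2 / img_w
    y_center = (box.y_min + box.y_max) / 2 / img_h
    width = (box.x_max - box.x_min) / img_w
    height = (box.y_max - box.y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 20x20 pixel checkbox on a 1000x1000 pixel scanned form.
    print(to_yolo_line(Box("checkbox", 100, 200, 120, 220), 1000, 1000))
```

Normalizing by image size keeps the labels valid when forms are resized before training, which is one reason this format is widely used with YOLO-family detectors.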