Input Fields Recognition in Documents Using Deep Learning Techniques

Revista Gestão Inovação e Tecnologias. Pub Date: 2021-08-12. DOI: 10.47059/revistageintec.v11i4.2468

Atharv Nagarikar, Rahul Singh Dangi, Samrit Kumar Maity, Ashish Kuvelkar, Sanjay Wandhekar


Citations: 2

Abstract

Identifying the input fields that appear on a document is a crucial step in digitizing it. This paper presents a deep-learning approach to detecting input fields in a form or document that contains text, images, and input fields such as text boxes and checkboxes. Forms were crawled and labelled manually with bounding boxes to build a dataset for training deep-learning models. A YOLO V3 model is trained on the labelled dataset, which has four classes (static text, static image, input text, checkbox) and 1500 instances. The paper covers a limited set of input field types that commonly appear on printed forms; given a labelled dataset for other types of input fields, the same YOLO V3 model can be trained for them as well. We also discuss how such detection models can scale and sustain higher loads. The model is trained for 3500 iterations and achieves an accuracy of 71 percent.
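The abstract does not say how the bounding-box labels were stored or how detections were scored. As an illustration only, the sketch below assumes the Darknet-style text format commonly used with YOLO V3 (class index followed by a normalized center, width, and height) and the usual intersection-over-union (IoU) overlap measure. The class names, their order, and both function names are hypothetical and are not taken from the paper.

```python
# Hypothetical class order; the paper names the four classes but not their indices.
CLASSES = ["static_text", "static_image", "input_text", "checkbox"]


def to_yolo_label(class_name, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a
    Darknet-style label line: 'class x_center y_center width height',
    with all four geometry values normalized to the image size."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    class_id = CLASSES.index(class_name)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def iou(box_a, box_b):
    """Intersection over union of two pixel boxes (x_min, y_min, x_max, y_max)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


if __name__ == "__main__":
    # A 20x20 pixel checkbox on a 1000x1000 pixel scanned form.
    print(to_yolo_label("checkbox", (100, 200, 120, 220), 1000, 1000))
    # Overlap between a labelled box and a shifted prediction.
    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Normalizing the coordinates keeps the labels valid when forms are resized before training. Whether the reported 71 percent figure is based on an IoU threshold is not stated in the abstract.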
