从完整的手写页面提取键值信息

IEEE International Conference on Document Analysis and Recognition Pub Date : 2023-04-26 DOI:10.48550/arXiv.2304.13530

Solène Tarride, Mélodie Boillet, Christopher Kermorvant

{"title":"从完整的手写页面提取键值信息","authors":"Solène Tarride, Mélodie Boillet, Christopher Kermorvant","doi":"10.48550/arXiv.2304.13530","DOIUrl":null,"url":null,"abstract":"We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the different steps that were so far performed by separate models: feature extraction, handwriting recognition and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at different levels: line, paragraph, and page. Our experiments show that attention-based models are especially interesting when applied on full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previous performances on all three datasets.","PeriodicalId":294655,"journal":{"name":"IEEE International Conference on Document Analysis and Recognition","volume":"115 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Key-value information extraction from full handwritten pages\",\"authors\":\"Solène Tarride, Mélodie Boillet, Christopher Kermorvant\",\"doi\":\"10.48550/arXiv.2304.13530\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the different steps that were so far performed by separate models: feature extraction, handwriting recognition and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at different levels: line, paragraph, and page. Our experiments show that attention-based models are especially interesting when applied on full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previous performances on all three datasets.\",\"PeriodicalId\":294655,\"journal\":{\"name\":\"IEEE International Conference on Document Analysis and Recognition\",\"volume\":\"115 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE International Conference on Document Analysis and Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2304.13530\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2304.13530","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

我们提出了一种基于变压器的方法从数字化手写文档中提取信息。我们的方法在一个模型中结合了迄今为止由不同模型执行的不同步骤:特征提取、手写识别和命名实体识别。我们将这种集成方法与传统的两阶段方法进行比较，传统的两阶段方法在命名实体识别之前执行手写识别，并在不同层次上呈现结果:行、段和页。我们的实验表明，当应用于整页时，基于注意力的模型特别有趣，因为它们不需要任何预先分割步骤。最后，我们展示了它们能够从键值注释中学习:一个具有相应命名实体的重要单词列表。我们将我们的模型与三个公共数据库(IAM, ESPOSALLES和POPP)上最先进的方法进行了比较，并在所有三个数据集上优于以前的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Key-value information extraction from full handwritten pages

We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the different steps that were so far performed by separate models: feature extraction, handwriting recognition and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at different levels: line, paragraph, and page. Our experiments show that attention-based models are especially interesting when applied on full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previous performances on all three datasets.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE International Conference on Document Analysis and Recognition

自引率

0.00%

发文量