Key Information Extraction from Mobile-Captured Vietnamese Receipt Images using Graph Neural Networks Approach

Van Dung Pham, L. Nguyen, Nhat Truong Pham, Bao Hung Nguyen, Duc Ngoc Minh Dang, Sy Dzung Nguyen
{"title":"Key Information Extraction from Mobile-Captured Vietnamese Receipt Images using Graph Neural Networks Approach","authors":"Van Dung Pham, L. Nguyen, Nhat Truong Pham, Bao Hung Nguyen, Due Ngoe Minh Dang, Sy Dzung Nguyen","doi":"10.1109/GTSD54989.2022.9989111","DOIUrl":null,"url":null,"abstract":"Information extraction and retrieval are growing fields that have a significant role in document parser and analysis systems. Researches and applications developed in recent years show the numerous difficulties and obstacles in extracting key information from documents. Thanks to the raising of graph theory and deep learning, graph representation and graph learning have been widely applied in information extraction to obtain more exact results. In this paper, we propose a solution upon graph neural networks (GNN) for key information extraction (KIE) that aims to extract the key information from mobile-captured Vietnamese receipt images. Firstly, the images are pre-processed using U2-Net, and then a CRAFT model is used to detect texts from the pre-processed images. Next, the implemented TransformerOCR model is employed for text recognition. Finally, a GNN-based model is designed to extract the key information based on the recognized texts. For validating the effectiveness of the proposed solution, the publicly available dataset released from the Mobile-Captured Receipt Recognition (MC-OCR) Challenge 2021 is used to train and evaluate. The experimental results indicate that our proposed solution achieves a character error rate (CER) score of 0.25 on the private test set, which is more comparable with all reported solutions in the MC-OCR Challenge 2021 as mentioned in the literature. For reproducing and knowledge-sharing purposes, our implementation of the proposed solution is publicly available at https://github.com/ThorPhamlKey_infomation_extraction.","PeriodicalId":125445,"journal":{"name":"2022 6th International Conference on Green Technology and Sustainable Development (GTSD)","volume":"50 12","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 6th International Conference on Green Technology and Sustainable Development (GTSD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GTSD54989.2022.9989111","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Information extraction and retrieval are growing fields that play a significant role in document parsing and analysis systems. Research and applications developed in recent years have revealed numerous difficulties and obstacles in extracting key information from documents. Thanks to the rise of graph theory and deep learning, graph representation and graph learning have been widely applied in information extraction to obtain more accurate results. In this paper, we propose a solution based on graph neural networks (GNN) for key information extraction (KIE) that aims to extract the key information from mobile-captured Vietnamese receipt images. First, the images are pre-processed using U2-Net, and then a CRAFT model is used to detect texts in the pre-processed images. Next, a TransformerOCR model is employed for text recognition. Finally, a GNN-based model is designed to extract the key information from the recognized texts. To validate the effectiveness of the proposed solution, the publicly available dataset released for the Mobile-Captured Receipt Recognition (MC-OCR) Challenge 2021 is used for training and evaluation. The experimental results indicate that our proposed solution achieves a character error rate (CER) of 0.25 on the private test set, which is competitive with the solutions reported for the MC-OCR Challenge 2021 in the literature. For reproducibility and knowledge-sharing purposes, our implementation of the proposed solution is publicly available at https://github.com/ThorPham/Key_infomation_extraction.
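The abstract describes a four-stage pipeline: U2-Net background removal, CRAFT text detection, TransformerOCR recognition, and GNN-based key-field classification. The sketch below shows one plausible way such stages could be wired together in Python; the stage functions and the field labels (SELLER, ADDRESS, TIMESTAMP, TOTAL_COST, as used in MC-OCR 2021) are hypothetical placeholders standing in for the actual model calls, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]

@dataclass
class TextRegion:
    polygon: List[Point]   # quadrilateral returned by the detector
    text: str = ""         # filled in by the recognizer
    label: str = "OTHER"   # filled in by the GNN classifier

def remove_background(image):
    """Stage 1 (hypothetical wrapper): U2-Net salient-object
    segmentation to crop the receipt out of the mobile photo."""
    return image  # placeholder: real code would mask and crop

def detect_text_regions(image) -> List[TextRegion]:
    """Stage 2 (hypothetical wrapper): CRAFT text detection,
    returning one polygon per text line."""
    return []  # placeholder: real code would return polygons

def recognize_text(image, regions: List[TextRegion]) -> List[TextRegion]:
    """Stage 3 (hypothetical wrapper): TransformerOCR recognition
    of each cropped region (Vietnamese text)."""
    return regions  # placeholder: real code would fill region.text

def extract_key_fields(regions: List[TextRegion]) -> List[TextRegion]:
    """Stage 4 (hypothetical wrapper): build a graph over the text
    boxes (nodes = boxes, edges = spatial neighbours) and classify
    each node with a GNN into labels such as SELLER, ADDRESS,
    TIMESTAMP, TOTAL_COST."""
    return regions  # placeholder: real code would fill region.label

def run_pipeline(image) -> List[TextRegion]:
    """Chain the four stages in the order given in the abstract."""
    image = remove_background(image)
    regions = detect_text_regions(image)
    regions = recognize_text(image, regions)
    return extract_key_fields(regions)
```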
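The reported metric is the character error rate (CER), conventionally defined as the Levenshtein (edit) distance between the predicted and ground-truth strings, normalized by the length of the ground truth. A minimal, self-contained reference implementation of that standard definition:

```python
def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = (insertions + deletions + substitutions) needed to turn
    the hypothesis into the reference, divided by reference length."""
    m, n = len(reference), len(hypothesis)
    # Single-row dynamic-programming table for Levenshtein distance.
    row = list(range(n + 1))
    for i in range(1, m + 1):
        prev, row[0] = row[0], i
        for j in range(1, n + 1):
            cur = row[j]
            row[j] = min(
                row[j] + 1,      # delete from the reference
                row[j - 1] + 1,  # insert into the reference
                prev + (reference[i - 1] != hypothesis[j - 1]),  # substitute
            )
            prev = cur
    return row[n] / max(m, 1)

# Example: 2 character edits against an 8-character reference -> 0.25,
# the same score the paper reports on the private test set.
print(character_error_rate("TOTAL 95", "T0TAL 9S"))  # 0.25
```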