Table Structure Recognition Based on Grid Shape Graph

Eunji Lee, Junhyeong Kwon, Haeyoon Yang, Jaewoo Park, Soonyoung Lee, H. Koo, N. Cho
Published in: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022-11-07
DOI: https://doi.org/10.23919/APSIPAASC55919.2022.9980172
Citations: 1

Abstract

Because tables present important information in compact form, table understanding has long been an essential topic in document image processing. Researchers have represented table structures in various formats, such as a simple grid structure, a graph with text or cell boxes as nodes, or a sequence of HTML tokens. However, these approaches have difficulty handling regularities (e.g., global row and column information) and spanning cells at the same time. In this paper, we propose a new table recognition method based on a grid shape graph, and we present networks for grid localization and for grouping grid elements. The approach is designed to exploit the grid structure while also dealing with spanning cells. To convert the grid structure into a cell structure, we only have to test adjacent pairs of grid elements, which enables efficient inference. In addition, we found that predicting row- and column-based relationships between grid elements improves the performance of cell-based connectivity estimation. We demonstrate the effectiveness of the proposed method through experiments on three benchmark datasets.
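The grid-to-cell conversion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a grouping model has already decided, for each grid element, whether it belongs to the same cell as its right neighbour and as its lower neighbour, and it merges elements with a standard union-find. The function name `group_grid_into_cells` and the inputs `merge_right` and `merge_down` are hypothetical names introduced for this example.

```python
from typing import Dict, List, Set, Tuple

GridElem = Tuple[int, int]  # (row, col) index of one grid element


def group_grid_into_cells(
    n_rows: int,
    n_cols: int,
    merge_right: Set[GridElem],
    merge_down: Set[GridElem],
) -> List[List[GridElem]]:
    """Group grid elements into cells with union-find.

    merge_right contains (r, c) when element (r, c) is predicted to share
    a cell with its right neighbour (r, c + 1); merge_down does the same
    for the neighbour below, (r + 1, c). Both sets are hypothetical
    stand-ins for the output of a grouping network.
    """
    parent: Dict[GridElem, GridElem] = {
        (r, c): (r, c) for r in range(n_rows) for c in range(n_cols)
    }

    def find(x: GridElem) -> GridElem:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: GridElem, b: GridElem) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Only adjacent pairs are examined, so the number of tests grows
    # linearly with the number of grid elements.
    for r in range(n_rows):
        for c in range(n_cols):
            if c + 1 < n_cols and (r, c) in merge_right:
                union((r, c), (r, c + 1))
            if r + 1 < n_rows and (r, c) in merge_down:
                union((r, c), (r + 1, c))

    # Collect the elements of each connected component as one cell.
    cells: Dict[GridElem, List[GridElem]] = {}
    for elem in parent:
        cells.setdefault(find(elem), []).append(elem)
    return sorted(sorted(members) for members in cells.values())


if __name__ == "__main__":
    # 2 x 3 grid whose top-left cell spans two columns.
    print(group_grid_into_cells(2, 3, {(0, 0)}, set()))
```

Because a spanning cell is a connected region of the grid, checking each element against its right and lower neighbours is enough to recover every cell; no test over all pairs of elements is needed.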