RCANet: A Rows and Columns Aggregated Network for Table Structure Recognition

Xinyi Shen, Lingjun Kong, Yunchao Bao, Yaowei Zhou, Weiguang Liu
DOI: 10.1109/ictc55111.2022.9778621 (https://doi.org/10.1109/ictc55111.2022.9778621)
Published in: 2022 3rd Information Communication Technologies Conference (ICTC), 2022-05-06
Citations: 2

Abstract

Most existing table structure recognition methods fall into two major categories: methods that detect table borders and methods that detect table rows and columns. Border detection suffers from an imbalance between positive and negative samples, because the number of border pixels is very small. Row and column detection avoids this imbalance, but some studies simplify it into row-by-row and column-by-column prediction, which introduces the problem of a large error tolerance. To solve this problem, we propose two modules, the Rows Aggregated (RA) module and the Columns Aggregated (CA) module. First, feature slicing and tiling are used to make approximate predictions for the rows and columns, which addresses the large error tolerance. Second, the row and column information is further extracted by computing channel attention maps. Finally, we use RA and CA to build a semantic segmentation network, the Rows and Columns Aggregated Network (RCANet), which performs row segmentation and column segmentation. We generate row and column masks on the ICDAR2013 dataset and evaluate the model on them. Experiments show that the proposed model outperforms the segmentation model based on row and column detection, with average precision, recall, and F1 score higher by 2.08%, 3.21%, and 2.45%, respectively.
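
The abstract does not spell out how the RA and CA modules are built, so the following is only a rough NumPy sketch of one plausible reading of "feature slicing and tiling" followed by channel attention. It assumes a feature map of shape (C, H, W). The function names, the use of mean pooling, the softmax over a channel affinity matrix, and the residual connection are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def row_aggregate(features: np.ndarray) -> np.ndarray:
    """Pool each row of a (C, H, W) feature map and tile it back across the width.

    Assumption: mean pooling stands in for whatever aggregation the paper uses.
    """
    # Slice: average over the width axis, one descriptor per row -> (C, H, 1).
    row_descriptor = features.mean(axis=2, keepdims=True)
    # Tile: repeat the descriptor along the width -> (C, H, W).
    return np.broadcast_to(row_descriptor, features.shape).copy()


def column_aggregate(features: np.ndarray) -> np.ndarray:
    """Pool each column of a (C, H, W) feature map and tile it back across the height."""
    # Slice: average over the height axis, one descriptor per column -> (C, 1, W).
    col_descriptor = features.mean(axis=1, keepdims=True)
    # Tile: repeat the descriptor along the height -> (C, H, W).
    return np.broadcast_to(col_descriptor, features.shape).copy()


def channel_attention(features: np.ndarray) -> np.ndarray:
    """Re-weight channels with a softmax-normalised (C, C) channel affinity map.

    Assumption: a generic channel-attention formulation with a residual sum.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)                      # (C, N)
    energy = flat @ flat.T                                 # (C, C) affinities
    energy = energy - energy.max(axis=1, keepdims=True)    # numerical stability
    attention = np.exp(energy)
    attention /= attention.sum(axis=1, keepdims=True)      # softmax per row
    out = attention @ flat                                 # mix channels
    return out.reshape(c, h, w) + features                 # residual connection
```

Under these assumptions, calling `channel_attention(row_aggregate(x))` on a feature map `x` yields a map in which every pixel of a row shares the same value. This gives an intuition for why aggregating along a row could make row segmentation less sensitive to isolated per-pixel errors.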