SpaDen:用于真实世界图表理解的稀疏和密集关键点估计

IEEE International Conference on Document Analysis and Recognition Pub Date : 2023-08-03 DOI:10.48550/arXiv.2308.01971

Saleem Ahmed, Pengyu Yan, D. Doermann, S. Setlur, Venugopal Govindaraju

{"title":"SpaDen:用于真实世界图表理解的稀疏和密集关键点估计","authors":"Saleem Ahmed, Pengyu Yan, D. Doermann, S. Setlur, Venugopal Govindaraju","doi":"10.48550/arXiv.2308.01971","DOIUrl":null,"url":null,"abstract":"We introduce a novel bottom-up approach for the extraction of chart data. Our model utilizes images of charts as inputs and learns to detect keypoints (KP), which are used to reconstruct the components within the plot area. Our novelty lies in detecting a fusion of continuous and discrete KP as predicted heatmaps. A combination of sparse and dense per-pixel objectives coupled with a uni-modal self-attention-based feature-fusion layer is applied to learn KP embeddings. Further leveraging deep metric learning for unsupervised clustering, allows us to segment the chart plot area into various objects. By further matching the chart components to the legend, we are able to obtain the data series names. A post-processing threshold is applied to the KP embeddings to refine the object reconstructions and improve accuracy. Our extensive experiments include an evaluation of different modules for KP estimation and the combination of deep layer aggregation and corner pooling approaches. The results of our experiments provide extensive evaluation for the task of real-world chart data extraction.","PeriodicalId":294655,"journal":{"name":"IEEE International Conference on Document Analysis and Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SpaDen : Sparse and Dense Keypoint Estimation for Real-World Chart Understanding\",\"authors\":\"Saleem Ahmed, Pengyu Yan, D. Doermann, S. Setlur, Venugopal Govindaraju\",\"doi\":\"10.48550/arXiv.2308.01971\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We introduce a novel bottom-up approach for the extraction of chart data. Our model utilizes images of charts as inputs and learns to detect keypoints (KP), which are used to reconstruct the components within the plot area. Our novelty lies in detecting a fusion of continuous and discrete KP as predicted heatmaps. A combination of sparse and dense per-pixel objectives coupled with a uni-modal self-attention-based feature-fusion layer is applied to learn KP embeddings. Further leveraging deep metric learning for unsupervised clustering, allows us to segment the chart plot area into various objects. By further matching the chart components to the legend, we are able to obtain the data series names. A post-processing threshold is applied to the KP embeddings to refine the object reconstructions and improve accuracy. Our extensive experiments include an evaluation of different modules for KP estimation and the combination of deep layer aggregation and corner pooling approaches. The results of our experiments provide extensive evaluation for the task of real-world chart data extraction.\",\"PeriodicalId\":294655,\"journal\":{\"name\":\"IEEE International Conference on Document Analysis and Recognition\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE International Conference on Document Analysis and Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2308.01971\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2308.01971","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

我们介绍了一种新颖的自下而上的方法来提取图表数据。我们的模型利用图表图像作为输入，并学习检测关键点(KP)，用于重建地块区域内的组件。我们的新颖之处在于检测连续和离散KP的融合，如预测的热图。将稀疏和密集的每像素目标与基于单模态自关注的特征融合层相结合，用于学习KP嵌入。进一步利用深度度量学习进行无监督聚类，允许我们将图表区域分割成不同的对象。通过进一步将图表组件与图例进行匹配，我们能够获得数据系列名称。对KP嵌入应用后处理阈值，以细化目标重构，提高精度。我们广泛的实验包括对KP估计的不同模块的评估以及深层聚合和角池方法的组合。我们的实验结果为现实世界的图表数据提取任务提供了广泛的评估。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

SpaDen : Sparse and Dense Keypoint Estimation for Real-World Chart Understanding

We introduce a novel bottom-up approach for the extraction of chart data. Our model utilizes images of charts as inputs and learns to detect keypoints (KP), which are used to reconstruct the components within the plot area. Our novelty lies in detecting a fusion of continuous and discrete KP as predicted heatmaps. A combination of sparse and dense per-pixel objectives coupled with a uni-modal self-attention-based feature-fusion layer is applied to learn KP embeddings. Further leveraging deep metric learning for unsupervised clustering, allows us to segment the chart plot area into various objects. By further matching the chart components to the legend, we are able to obtain the data series names. A post-processing threshold is applied to the KP embeddings to refine the object reconstructions and improve accuracy. Our extensive experiments include an evaluation of different modules for KP estimation and the combination of deep layer aggregation and corner pooling approaches. The results of our experiments provide extensive evaluation for the task of real-world chart data extraction.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE International Conference on Document Analysis and Recognition

自引率

0.00%

发文量