Swin-chart: An efficient approach for chart classification

IF 3.9 | Zone 3, Computer Science | Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Anurag Dhote, Mohammed Javed, David S. Doermann
Pattern Recognition Letters, vol. 185, pp. 203-209. Published 2024-08-22. DOI: 10.1016/j.patrec.2024.08.012
https://www.sciencedirect.com/science/article/pii/S0167865524002447

Abstract

Charts are a visualization tool used in scientific documents to make the complex relationships underlying data and experiments easier to comprehend. Because researchers use many different chart types to convey scientific information, data extraction and subsequent chart understanding are very challenging. Many studies in the literature address the problem of chart mining, motivated by the need to edit existing charts, carry out extrapolative studies, and gain a deeper understanding of the underlying data. The first step towards chart understanding is chart classification, for which traditional ML and CNN-based deep learning models have been used in the literature. In this paper, we propose Swin-Chart, a Swin Transformer-based deep learning approach for chart classification that generalizes well across multiple datasets with a wide range of chart categories. Swin-Chart comprises a pre-trained Swin Transformer, a finetuning component, and a weight averaging component. The proposed approach is tested on five chart image benchmark datasets. We observe that the Swin-Chart model outperforms existing state-of-the-art models on all five datasets. Furthermore, we provide an ablation study of the Swin-Chart model on all five datasets to understand the importance of its sub-parts: the backbone Swin Transformer model, the number of best weights selected for the weight averaging component, and the presence of the weight averaging component itself.

The Swin-Chart model also placed first in the chart classification task on the latest dataset of the CHART Infographics competition at ICDAR 2023 (chartinfo.github.io).
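
The weight averaging component mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the best weights are chosen or combined, so this example assumes that checkpoints are ranked by a validation score and that the top k are averaged uniformly, parameter by parameter. The function name `average_best_weights`, the parameter names, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Assumption: each checkpoint maps a parameter name to its flattened values.
Weights = Dict[str, List[float]]


def average_best_weights(
    checkpoints: Sequence[Weights],
    val_scores: Sequence[float],
    k: int,
) -> Weights:
    """Uniformly average the k checkpoints with the highest validation score."""
    if len(checkpoints) != len(val_scores):
        raise ValueError("one validation score is needed per checkpoint")
    if not 1 <= k <= len(checkpoints):
        raise ValueError("k must be between 1 and the number of checkpoints")

    # Rank checkpoint indices by validation score, best first.
    ranked = sorted(
        range(len(checkpoints)), key=lambda i: val_scores[i], reverse=True
    )
    best = [checkpoints[i] for i in ranked[:k]]

    averaged: Weights = {}
    for name in best[0]:
        # Group the values of one parameter across the selected checkpoints.
        columns = zip(*(ckpt[name] for ckpt in best))
        averaged[name] = [sum(col) / k for col in columns]
    return averaged


# Toy checkpoints with made-up values, for illustration only.
checkpoints = [
    {"head.weight": [0.0, 2.0], "head.bias": [1.0]},
    {"head.weight": [2.0, 4.0], "head.bias": [3.0]},
    {"head.weight": [9.0, 9.0], "head.bias": [9.0]},
]
val_scores = [0.91, 0.93, 0.80]

print(average_best_weights(checkpoints, val_scores, k=2))
# prints {'head.weight': [1.0, 3.0], 'head.bias': [2.0]}
```

With k = 1 the function returns the single best checkpoint unchanged, which corresponds to the "no weight averaging" setting that the ablation study compares against.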

Source journal
Pattern Recognition Letters (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months
Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.