AI-ChartParser: A Method For Extracting Experimental Data From Curve Charts in Academic Papers

Impact factor: 2.9; Zone 4 (Computer Science); JCR Q2 (Computer Science, Software Engineering)
Wenjin Yang, Jie He, Xiaotong Zhang, Haiyan Gong
Computer Graphics Forum, volume 44, issue 6. Published 2025-06-26. DOI: 10.1111/cgf.70146
https://onlinelibrary.wiley.com/doi/10.1111/cgf.70146

Abstract

In engineering and the natural sciences, curve charts are indispensable visualization tools for scientific research, product development and engineering design, because they encapsulate data needed for analysis. Existing methods for extracting data from line charts rely predominantly on single-task models, which are often limited in efficiency and generalization. To overcome these limitations, we propose AI-ChartParser, an end-to-end deep learning model that uses multi-task learning to perform chart element detection, pivot point detection and curve detection concurrently. This approach parses diverse chart formats within a single framework. Furthermore, we introduce an Interval-Mean Space-Numerical Mapping algorithm that addresses challenges in data range extraction and significantly reduces conversion errors. We combine these methods into a data extraction tool that automatically converts line charts into tabular data. Our model performs well on complex real-world datasets, achieving state-of-the-art accuracy and speed on all three tasks. To facilitate further research, the source code and pre-trained models are released at https://github.com/ywking/ChartParser.git.
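
The abstract does not spell out how the Interval-Mean Space-Numerical Mapping algorithm works, so the following is only an illustrative sketch of the general problem it targets: converting pixel positions on a chart axis into data values. It assumes a linear axis and takes the scale as the mean of the value-per-pixel ratios over adjacent tick intervals, which is one plausible reading of "interval mean" and not the authors' confirmed method. The function names `fit_axis_mapping` and `pixel_to_value` are hypothetical.

```python
from statistics import mean


def fit_axis_mapping(tick_pixels, tick_values):
    """Estimate a linear pixel-to-value mapping from detected axis ticks.

    Assumption (not taken from the paper): the scale is the mean of the
    value-per-pixel ratios of adjacent tick intervals, so that localisation
    noise in any single tick is averaged out.
    """
    if len(tick_pixels) != len(tick_values) or len(tick_pixels) < 2:
        raise ValueError("need at least two matched ticks")

    # Order ticks by pixel position so adjacent pairs form intervals.
    pairs = sorted(zip(tick_pixels, tick_values))

    # Value change per pixel for each adjacent interval.
    scales = [
        (v1 - v0) / (p1 - p0)
        for (p0, v0), (p1, v1) in zip(pairs, pairs[1:])
        if p1 != p0
    ]
    if not scales:
        raise ValueError("ticks must have distinct pixel positions")
    scale = mean(scales)

    # Average the offset over all ticks so no single tick anchors the mapping.
    offset = mean(v - scale * p for p, v in pairs)
    return scale, offset


def pixel_to_value(pixel, scale, offset):
    """Map a pixel coordinate on the axis to a data value."""
    return scale * pixel + offset


if __name__ == "__main__":
    # Hypothetical ticks: pixel columns 100, 200, 300 labelled 0, 5, 10.
    scale, offset = fit_axis_mapping([100, 200, 300], [0.0, 5.0, 10.0])
    print(pixel_to_value(250, scale, offset))
```

The same functions work for a vertical axis, where pixel coordinates grow downward while values grow upward: the fitted scale simply comes out negative.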

Source journal
Computer Graphics Forum (Engineering and Technology; Computer Science: Software Engineering)
CiteScore: 5.80
Self-citation rate: 12.00%
Articles published: 175
Review time: 3-6 weeks
Journal description: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.