Title: AI-ChartParser: A Method For Extracting Experimental Data From Curve Charts in Academic Papers
Authors: Wenjin Yang, Jie He, Xiaotong Zhang, Haiyan Gong
Journal: Computer Graphics Forum, volume 44, issue 6
Publication date: 2025-06-26
Publication type: Journal Article
DOI: 10.1111/cgf.70146
URL: https://onlinelibrary.wiley.com/doi/10.1111/cgf.70146
Impact factor: 2.9 (JCR Q2, Computer Science, Software Engineering)
Citations: 0
Abstract
In engineering and the natural sciences, curve charts are indispensable visualization tools for scientific research, product development and engineering design, because they encapsulate data that is crucial for analysis. Existing methods for extracting data from line charts depend predominantly on single-task models, which are often limited in efficiency and generalization. To overcome these limitations, we propose AI-ChartParser, an end-to-end deep learning model that uses multi-task learning to perform chart element detection, pivot point detection and curve detection concurrently. This approach parses diverse chart formats effectively and efficiently within a single framework. Furthermore, we introduce an Interval-Mean Space-Numerical Mapping algorithm designed to address challenges in data range extraction, thereby significantly reducing conversion errors. We have combined all the methods discussed in this paper into a comprehensive data extraction tool that automatically converts line charts into tabular data. Our model performs strongly on complex real-world datasets, achieving state-of-the-art accuracy and speed across all three tasks. To facilitate further research, the source code and pre-trained models are released at https://github.com/ywking/ChartParser.git.
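The abstract does not spell out how the Interval-Mean Space-Numerical Mapping algorithm works, so the following is only a minimal sketch of the general idea of converting pixel coordinates into data values from detected axis ticks. It assumes a linear axis and assumes that "interval-mean" can be read as averaging the value-per-pixel ratio over consecutive tick intervals. The function names `mean_interval_scale` and `pixel_to_value` are hypothetical and are not taken from the authors' repository.

```python
from statistics import mean


def mean_interval_scale(tick_pixels, tick_values):
    """Average data-units-per-pixel over consecutive tick intervals.

    Assumption: the axis is linear, so every interval should give roughly
    the same ratio; averaging dampens per-tick localisation noise.
    """
    if len(tick_pixels) != len(tick_values) or len(tick_pixels) < 2:
        raise ValueError("need at least two ticks with matching values")
    ratios = []
    for i in range(len(tick_pixels) - 1):
        pixel_gap = tick_pixels[i + 1] - tick_pixels[i]
        if pixel_gap == 0:
            raise ValueError("two ticks share the same pixel position")
        ratios.append((tick_values[i + 1] - tick_values[i]) / pixel_gap)
    return mean(ratios)


def pixel_to_value(pixel, tick_pixels, tick_values):
    """Map one pixel coordinate on an axis to a data value."""
    scale = mean_interval_scale(tick_pixels, tick_values)
    # Anchor the mapping at the centroid of the ticks rather than at a
    # single tick, so that no individual detection error dominates.
    pixel_ref = mean(tick_pixels)
    value_ref = mean(tick_values)
    return value_ref + (pixel - pixel_ref) * scale


if __name__ == "__main__":
    # Hypothetical x-axis ticks: pixel columns and their printed labels.
    print(pixel_to_value(150, [100, 200, 300], [0.0, 10.0, 20.0]))
    # Hypothetical y-axis ticks: pixel rows grow downward, values grow upward.
    print(pixel_to_value(350, [400, 300, 200], [0.0, 1.0, 2.0]))
```

Because the scale is signed, the same function handles image y-axes, where pixel rows increase downward while data values increase upward. A logarithmic axis would need the tick values transformed before mapping, which this sketch does not attempt.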
About the journal:
Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.