RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic

IEEE International Conference on Document Analysis and Recognition. Pub Date: 2023-08-03. DOI: 10.48550/arXiv.2308.01979

Saleem Ahmed, Bhavin Jawade, Shubham Pandey, S. Setlur, Venugopal Govindaraju



Abstract

We present a comprehensive study of the chart visual question-answering (QA) task, addressing the challenges of comprehending and extracting data from chart visualizations within documents. Despite efforts to tackle this problem using synthetic charts, solutions are limited by the shortage of annotated real-world data. To fill this gap, we introduce a benchmark and dataset for chart visual QA on real-world charts, offering a systematic analysis of the task and a novel taxonomy for template-based chart question creation. Our contributions include a new answer type, 'list', with both ranked and unranked variations. Our study is conducted on a real-world chart dataset from scientific literature, which exhibits higher visual complexity than the datasets used in prior work. We focus on template-based QA and how it can serve as a standard for evaluating the first-order logic capabilities of models. Our experiments, conducted on a real-world out-of-distribution dataset, provide a robust evaluation of large-scale pre-trained models and advance the field of chart visual QA and formal logic verification for neural networks in general.
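To make the two technical ideas in the abstract concrete, the following is a minimal sketch, not the paper's template set or its evaluation metric. It assumes a hypothetical chart stored as a dictionary of series, phrases one template-style question as a first-order formula (an existential quantifier over series, a universal quantifier over consecutive points), and compares 'list' answers in ranked and unranked mode. The data, function names, and exact-match comparison are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical chart data (invented for illustration):
# series name -> {x-axis label -> value}, with labels in axis order.
Chart = Dict[str, Dict[str, float]]

chart: Chart = {
    "A": {"2019": 1.0, "2020": 2.0, "2021": 3.5},
    "B": {"2019": 2.5, "2020": 2.0, "2021": 1.5},
}


def exists_strictly_increasing_series(data: Chart) -> bool:
    """Template-style binary question as a first-order formula:
    EXISTS s . FORALL i . value(s, x_i) < value(s, x_{i+1})."""
    for series in data.values():
        values = list(series.values())  # dicts preserve insertion order
        if all(a < b for a, b in zip(values, values[1:])):
            return True
    return False


def series_ranked_by_max(data: Chart) -> List[str]:
    """A question whose answer is a ranked list: series names ordered
    by their maximum value, largest first."""
    return sorted(data, key=lambda s: max(data[s].values()), reverse=True)


def list_answer_matches(pred: Sequence[str], gold: Sequence[str], ranked: bool) -> bool:
    """Assumed exact-match comparison for 'list' answers.
    Ranked: order matters. Unranked: only the items matter."""
    if ranked:
        return list(pred) == list(gold)
    return sorted(pred) == sorted(gold)


if __name__ == "__main__":
    print(exists_strictly_increasing_series(chart))
    gold = series_ranked_by_max(chart)
    print(gold)
    print(list_answer_matches(["B", "A"], gold, ranked=True))
    print(list_answer_matches(["B", "A"], gold, ranked=False))
```

In this sketch the same predicted list can be wrong as a ranked answer yet correct as an unranked one, which is the distinction the 'list' answer type draws.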

