DataAgent: Evaluating Large Language Models’ Ability to Answer Zero-Shot, Natural Language Queries

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC), pp. 1-5; Pub Date: 2024-02-07; DOI: 10.1109/ICAIC60265.2024.10433803

Manit Mishra, Abderrahman Braham, Charles Marsom, Bryan Chung, Gavin Griffin, Dakshesh Sidnerlikar, Chatanya Sarin, Arjun Rajaram


Citations: 0

Abstract

Conventional processes for analyzing datasets and extracting meaningful information are often time-consuming and laborious. Previous work has identified manual, repetitive coding and data collection as major obstacles that hinder data scientists from undertaking more nuanced, higher-level work. To address this, we evaluated OpenAI’s GPT-3.5 as a "Language Data Scientist" (LDS) that can extrapolate key findings, including correlations and basic information, from a given dataset. The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards, including data science code-generation tasks involving libraries such as NumPy, Pandas, Scikit-Learn, and TensorFlow. It was broadly successful in correctly answering data science queries about the benchmark datasets. To answer a given question, the LDS used various novel prompt engineering techniques, including Chain-of-Thought reinforcement and SayCan prompt engineering. Our findings demonstrate great potential for leveraging Large Language Models for low-level, zero-shot data analysis.
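To make the zero-shot setup concrete, the following is a minimal sketch of how a prompt for such a query might be assembled from a dataset preview and a natural language question. It is not the authors' prompt or pipeline: the function name `build_zero_shot_prompt`, the prompt wording, and the example data are all hypothetical, and the call to the language model is left out.

```python
import csv
import io


def build_zero_shot_prompt(csv_text: str, question: str, max_rows: int = 3) -> str:
    """Assemble a zero-shot prompt from a dataset preview and a question.

    Hypothetical helper for illustration only; the prompt wording is an
    assumption and is not taken from the paper.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, sample = rows[0], rows[1:1 + max_rows]
    lines = [
        "You are a data scientist. A pandas DataFrame named `df` has these columns:",
        ", ".join(header),
        "Sample rows:",
    ]
    # Show only a few rows so the prompt stays short.
    lines += [", ".join(row) for row in sample]
    lines += [
        f"Question: {question}",
        # Chain-of-Thought style instruction (assumed wording).
        "Think step by step, then write Python code using pandas that answers the question.",
    ]
    return "\n".join(lines)


# Made-up example data, not one of the benchmark datasets.
example_csv = "age,income\n25,40000\n40,65000\n31,52000\n58,71000\n"
prompt = build_zero_shot_prompt(example_csv, "Is age correlated with income?")
print(prompt)
```

No worked examples are included in the prompt, which is what makes the query zero-shot; in a full system the returned code would then be run against the dataset to produce the answer.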
