Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Authors: Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu, Sameena Shah
Venue: arXiv - QuantFin - General Finance
Published: 2023-10-12
DOI: arxiv-2310.08678 (https://doi.org/arxiv-2310.08678)
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance on a
wide range of Natural Language Processing (NLP) tasks, often matching or even
beating state-of-the-art task-specific models. This study aims to assess the
financial reasoning capabilities of LLMs. We leverage mock exam questions from
the Chartered Financial Analyst (CFA) Program to conduct a comprehensive
evaluation of ChatGPT and GPT-4 in financial analysis, considering Zero-Shot
(ZS), Chain-of-Thought (CoT), and Few-Shot (FS) scenarios. We present an
in-depth analysis of the models' performance and limitations, and estimate
whether they would have a chance at passing the CFA exams. Finally, we outline
insights into potential strategies and improvements to enhance the
applicability of LLMs in finance. With this in mind, we hope this work paves
the way for future studies to continue enhancing LLMs for financial reasoning
through rigorous evaluation.