Emotional intelligence of Large Language Models

IF 2.4, Zone 3 (Psychology), Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Journal of Pacific Rim Psychology, Pub Date: 2023-01-01, DOI: 10.1177/18344909231213958

Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Jia Liu



Abstract

Large Language Models (LLMs) have demonstrated remarkable abilities across numerous disciplines, primarily assessed through tasks in language generation, knowledge utilization, and complex reasoning. However, their alignment with human emotions and values, which is critical for real-world applications, has not been systematically evaluated. Here, we assessed LLMs' Emotional Intelligence (EI), encompassing emotion recognition, interpretation, and understanding, which is necessary for effective communication and social interactions. Specifically, we first developed a novel psychometric assessment focusing on Emotion Understanding (EU), a core component of EI. The test is objective, performance-driven, and text-based; it requires evaluating complex emotions in realistic scenarios and provides a consistent assessment of both human and LLM capabilities. With a reference frame constructed from over 500 adults, we tested a variety of mainstream LLMs. Most achieved above-average Emotional Quotient (EQ) scores, with GPT-4 attaining an EQ of 117 and exceeding 89% of human participants. Interestingly, a multivariate pattern analysis revealed that some LLMs apparently did not rely on a human-like mechanism to achieve human-level performance, as their representational patterns were qualitatively distinct from those of humans. In addition, we discussed the impact of factors such as model size, training method, and architecture on LLMs' EQ. In summary, our study presents one of the first psychometric evaluations of the human-like characteristics of LLMs, which may shed light on the future development of LLMs aiming for both high intellectual and emotional intelligence.

Project website: https://emotional-intelligence.github.io/
