{"title":"Toward accurate psychological simulations: Investigating LLMs’ responses to personality and cultural variables","authors":"Chihao Li , Yue Qi","doi":"10.1016/j.chb.2025.108687","DOIUrl":null,"url":null,"abstract":"<div><div>Large language models (LLMs) have demonstrated complex language comprehension, making them potential tools for psychological research. However, challenges remain in assessing their psychological properties, particularly regarding prompt design, comparison with human data, and the ability to simulate real-world psychological differences across cultural groups. This study investigates how LLMs respond to personality assessments and cultural variables, addressing gaps in previous research.</div><div>In three studies, we explored the impact of model parameters and prompt variations on LLM responses to personality tests. Study 1 examined how temperature, model type, and prompt templates influenced LLM responses, revealing that while temperature settings had minimal impact, prompt variations led to significant differences. Study 2 compared LLMs' self-report personality scores with human data (N = 18,192–49,159), finding that LLMs scored higher in positive traits (e.g., extroversion) and lower in negative traits (e.g., psychopathy), reflecting their training biases. Study 3 tested LLMs’ ability to simulate cultural differences by assessing virtual subjects from China and the USA. While significant differences were observed between the groups, both demonstrated East Asian self-construal patterns, indicating limitations in simulating authentic cultural psychological differences.</div><div>These findings highlight the influence of prompt design on LLM responses, the divergence between LLM and human personality profiles, and the difficulty of simulating accurate cultural psychological differences. These results underscore the need for more refined methodologies in psychological simulations using LLMs and suggest that current models struggle to represent diverse human psychological traits accurately.</div></div>","PeriodicalId":48471,"journal":{"name":"Computers in Human Behavior","volume":"170 ","pages":"Article 108687"},"PeriodicalIF":9.0000,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in Human Behavior","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0747563225001347","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
Large language models (LLMs) have demonstrated complex language comprehension, making them potential tools for psychological research. However, challenges remain in assessing their psychological properties, particularly regarding prompt design, comparison with human data, and the ability to simulate real-world psychological differences across cultural groups. This study investigates how LLMs respond to personality assessments and cultural variables, addressing gaps in previous research.
In three studies, we explored the impact of model parameters and prompt variations on LLM responses to personality tests. Study 1 examined how temperature, model type, and prompt templates influenced LLM responses, revealing that while temperature settings had minimal impact, prompt variations led to significant differences. Study 2 compared LLMs' self-report personality scores with human data (N = 18,192–49,159), finding that LLMs scored higher on positive traits (e.g., extraversion) and lower on negative traits (e.g., psychopathy), reflecting biases in their training data. Study 3 tested LLMs' ability to simulate cultural differences by assessing virtual subjects from China and the USA. While significant differences were observed between the groups, both demonstrated East Asian self-construal patterns, indicating limitations in simulating authentic cultural psychological differences.
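The temperature-by-template grid described in Study 1 can be illustrated with a short script. The following Python sketch is not the authors' code: the OpenAI client, model name, questionnaire item, and prompt templates are all illustrative assumptions, since the abstract does not report the study's exact tooling or instruments.

from itertools import product
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative Big Five item; the actual instruments are not named in the abstract.
ITEM = "I see myself as someone who is talkative."
TEMPLATES = [  # two hypothetical prompt-template variants
    "Rate this statement from 1 (strongly disagree) to 5 (strongly agree): {item}",
    "You are completing a personality questionnaire. Reply with a number 1-5 only: {item}",
]
TEMPERATURES = [0.0, 0.7, 1.4]

def ask(template: str, temperature: float) -> str:
    """Send one prompt variant at one temperature and return the raw reply."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model; the study compared several model types
        messages=[{"role": "user", "content": template.format(item=ITEM)}],
        temperature=temperature,
    )
    return response.choices[0].message.content

for template, temp in product(TEMPLATES, TEMPERATURES):
    print(f"T={temp:.1f} | {ask(template, temp)!r}")

Scoring the numeric replies across such a grid is what permits the comparison the abstract reports: responses that shift with the prompt wording far more than with the temperature setting.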
These findings highlight the influence of prompt design on LLM responses, the divergence between LLM and human personality profiles, and the difficulty of accurately simulating cultural psychological differences. They underscore the need for more refined methodologies in LLM-based psychological simulations and suggest that current models struggle to represent diverse human psychological traits accurately.
Journal overview:
Computers in Human Behavior is a scholarly journal that explores the psychological aspects of computer use. It publishes original theoretical works, research reports, literature reviews, and software and book reviews. The journal examines both the use of computers in psychology, psychiatry, and related fields and the psychological impact of computer use on individuals, groups, and society. Articles address topics such as professional practice, training, research, human development, learning, cognition, personality, and social interaction. The journal focuses on human interaction with computers, treating the computer as a medium through which human behaviors are shaped and expressed. It will be valuable to professionals interested in the psychological aspects of computer use, even those with limited knowledge of computers.