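Inferring Personality From Social Media Activity Using Large Language Models: Cross‐Model Agreement, Temporal Stability, and Convergent Validity With Self‐Reports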

IF 2.7, Zone 1 (Psychology), JCR Q1 (Psychology)

Journal of Personality, Pub Date: 2025-09-02, DOI: 10.1111/jopy.70019

Davide Marengo, Christian Montag, Michele Settanni


Citations: 0


Inferring Personality From Social Media Activity Using Large Language Models: Cross‐Model Agreement, Temporal Stability, and Convergent Validity With Self‐Reports

Introduction: Large language models (LLMs) offer a promising approach to infer personality traits unobtrusively from digital footprints. However, the reliability and validity of these inferences remain underexplored.

Method: Gemini 1.5 Pro and GPT‐4o were used to infer Big Five traits from 2 years of Facebook posts by 1214 Italian users. Predictions were compared to self‐reports on the Ten‐Item Personality Inventory.

Results: LLM predictions underestimated Agreeableness and Conscientiousness, overestimated Extraversion, while Neuroticism and Openness closely aligned with self‐report means. On repeated prompting, Gemini 1.5 Pro inferences showed less variability than GPT‐4o, with both models achieving excellent reliability when aggregating inferences. Temporal stability was highest when combining predictions across LLMs, with test–retest correlations over 2 years ranging from 0.44 for Conscientiousness to 0.60 for Openness. Cross‐LLM agreement was highest when combining inferences from multiple time points, with correlations ranging from 0.58 for Neuroticism to 0.83 for Extraversion. Correlations with self‐reports were modest, reaching 0.27 for Extraversion, 0.24 for Agreeableness, 0.23 for Conscientiousness, 0.18 for Neuroticism, and 0.31 for Openness when combining LLM inferences across LLMs and time points.

Conclusion: These findings advance understanding of LLMs' potential for personality inference, highlighting the importance of aggregating inferences to enhance the reliability and validity of such assessments.
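The abstract's central point, that averaging repeated inferences yields a more reliable score than any single inference, can be illustrated with a small simulation. The sketch below is not the authors' pipeline and uses no real data: the latent trait scores, the noise level, the number of repeated runs, and every function name (`pearson`, `spearman_brown`, `simulate`) are assumptions made purely for illustration. Only the sample size of 1214 is borrowed from the abstract.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_brown(r_single, k):
    """Spearman-Brown prediction: reliability of the mean of k parallel ratings."""
    return k * r_single / (1 + (k - 1) * r_single)


def simulate(n_users=1214, n_runs=10, noise_sd=1.0, seed=0):
    """Simulate repeated noisy inferences of one trait (all values hypothetical)."""
    rng = random.Random(seed)
    # Hypothetical latent trait score for each user.
    latent = [rng.gauss(0, 1) for _ in range(n_users)]
    # Each run is the latent score plus independent noise, standing in for
    # one repeated prompt of a model on the same user's posts.
    runs = [[t + rng.gauss(0, noise_sd) for t in latent] for _ in range(n_runs)]
    # Aggregate by averaging each user's scores across runs.
    aggregated = [mean(scores) for scores in zip(*runs)]
    return latent, runs, aggregated


latent, runs, aggregated = simulate()

# Agreement between two single runs estimates single-run reliability.
r_between_runs = pearson(runs[0], runs[1])
# Correlation with the latent score, for one run and for the aggregate.
r_single = pearson(runs[0], latent)
r_aggregated = pearson(aggregated, latent)

print(f"run-to-run correlation:         {r_between_runs:.2f}")
print(f"predicted reliability of mean:  {spearman_brown(r_between_runs, len(runs)):.2f}")
print(f"single run vs latent score:     {r_single:.2f}")
print(f"aggregated runs vs latent:      {r_aggregated:.2f}")
```

In this toy setup the aggregated score tracks the latent score more closely than a single run does, because independent noise averages out. Real LLM inferences may share systematic bias across runs or across models, which averaging would not remove, so the gain in practice can be smaller than the Spearman-Brown prediction suggests.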


Source journal

Journal of Personality (PSYCHOLOGY, SOCIAL)

CiteScore: 9.60

Self-citation rate: 6.00%

Articles published: 100

About the journal: Journal of Personality publishes scientific investigations in the field of personality. It focuses particularly on personality and behavior dynamics, personality development, and individual differences in the cognitive, affective, and interpersonal domains. The journal reflects and stimulates interest in the growth of new theoretical and methodological approaches in personality psychology.