Davide Marengo, Christian Montag, Michele Settanni
Journal of Personality. Published 2025-09-02. DOI: 10.1111/jopy.70019 (https://doi.org/10.1111/jopy.70019). Impact factor: 2.7 (JCR Q1, Psychology).
Inferring Personality From Social Media Activity Using Large Language Models: Cross‐Model Agreement, Temporal Stability, and Convergent Validity With Self‐Reports
Introduction: Large language models (LLMs) offer a promising approach to infer personality traits unobtrusively from digital footprints. However, the reliability and validity of these inferences remain underexplored.

Method: Gemini 1.5 Pro and GPT‐4o were used to infer Big Five traits from 2 years of Facebook posts by 1214 Italian users. Predictions were compared to self‐reports on the Ten‐Item Personality Inventory.

Results: LLM predictions underestimated Agreeableness and Conscientiousness, overestimated Extraversion, while Neuroticism and Openness closely aligned with self‐report means. On repeated prompting, Gemini 1.5 Pro inferences showed less variability than GPT‐4o, with both models achieving excellent reliability when aggregating inferences. Temporal stability was highest when combining predictions across LLMs, with test–retest correlations over 2 years ranging from 0.44 for Conscientiousness to 0.60 for Openness. Cross‐LLM agreement was highest when combining inferences from multiple time points, with correlations ranging from 0.58 for Neuroticism to 0.83 for Extraversion. Correlations with self‐reports were modest, reaching 0.27 for Extraversion, 0.24 for Agreeableness, 0.23 for Conscientiousness, 0.18 for Neuroticism, and 0.31 for Openness when combining LLM inferences across LLMs and time points.

Conclusion: These findings advance understanding of LLMs' potential for personality inference, highlighting the importance of aggregating inferences to enhance the reliability and validity of such assessments.
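The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `aggregate` and `pearson`, the choice of a simple unweighted mean, and all numbers below are hypothetical and serve only to show the general idea of averaging per-user trait scores across repeated inferences (prompts, models, or time points) before correlating them with self-reports.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def aggregate(runs):
    """Average per-user scores across runs.

    Each run is a list holding one score per user, in the same user order.
    An unweighted mean is an assumption made for this sketch.
    """
    return [mean(scores) for scores in zip(*runs)]


# Hypothetical toy data for one trait and five users (not from the study).
self_report = [2.0, 3.5, 4.0, 5.5, 6.0]
run_a = [3.0, 3.0, 5.0, 4.0, 6.0]  # e.g. one model or one prompt repetition
run_b = [2.0, 5.0, 3.0, 6.0, 5.0]  # e.g. a second model or repetition

combined = aggregate([run_a, run_b])
r_combined = pearson(combined, self_report)
```

The same `aggregate` call could be applied across repetitions, across models, or across time points; which level of aggregation to use is an analysis decision the sketch does not settle.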
About the journal:
Journal of Personality publishes scientific investigations in the field of personality. It focuses particularly on personality and behavior dynamics, personality development, and individual differences in the cognitive, affective, and interpersonal domains. The journal reflects and stimulates interest in the growth of new theoretical and methodological approaches in personality psychology.