{"title":"模拟公众意见:比较llm和随机森林的分布和个人水平预测。","authors":"Fernando Miranda, Pedro Paulo Balbi","doi":"10.3390/e27090923","DOIUrl":null,"url":null,"abstract":"<p><p>Understanding and modeling the flow of information in human societies is essential for capturing phenomena such as polarization, opinion formation, and misinformation diffusion. Traditional agent-based models often rely on simplified behavioral rules that fail to capture the nuanced and context-sensitive nature of human decision-making. In this study, we explore the potential of Large Language Models (LLMs) as data-driven, high-fidelity agents capable of simulating individual opinions under varying informational conditions. Conditioning LLMs on real survey data from the 2020 American National Election Studies (ANES), we investigate their ability to predict individual-level responses across a spectrum of political and social issues in a zero-shot setting, without any training on the survey outcomes. Using Jensen-Shannon distance to quantify divergence in opinion distributions and F1-score to measure predictive accuracy, we compare LLM-generated simulations to those produced by a supervised Random Forest model. While performance at the individual level is comparable, LLMs consistently produce aggregate opinion distributions closer to the empirical ground truth. These findings suggest that LLMs offer a promising new method for simulating complex opinion dynamics and modeling the probabilistic structure of belief systems in computational social science.</p>","PeriodicalId":11694,"journal":{"name":"Entropy","volume":"27 9","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12468613/pdf/","citationCount":"0","resultStr":"{\"title\":\"Simulating Public Opinion: Comparing Distributional and Individual-Level Predictions from LLMs and Random Forests.\",\"authors\":\"Fernando Miranda, Pedro Paulo Balbi\",\"doi\":\"10.3390/e27090923\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Understanding and modeling the flow of information in human societies is essential for capturing phenomena such as polarization, opinion formation, and misinformation diffusion. Traditional agent-based models often rely on simplified behavioral rules that fail to capture the nuanced and context-sensitive nature of human decision-making. In this study, we explore the potential of Large Language Models (LLMs) as data-driven, high-fidelity agents capable of simulating individual opinions under varying informational conditions. Conditioning LLMs on real survey data from the 2020 American National Election Studies (ANES), we investigate their ability to predict individual-level responses across a spectrum of political and social issues in a zero-shot setting, without any training on the survey outcomes. Using Jensen-Shannon distance to quantify divergence in opinion distributions and F1-score to measure predictive accuracy, we compare LLM-generated simulations to those produced by a supervised Random Forest model. While performance at the individual level is comparable, LLMs consistently produce aggregate opinion distributions closer to the empirical ground truth. These findings suggest that LLMs offer a promising new method for simulating complex opinion dynamics and modeling the probabilistic structure of belief systems in computational social science.</p>\",\"PeriodicalId\":11694,\"journal\":{\"name\":\"Entropy\",\"volume\":\"27 9\",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12468613/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Entropy\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.3390/e27090923\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PHYSICS, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Entropy","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.3390/e27090923","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PHYSICS, MULTIDISCIPLINARY","Score":null,"Total":0}
Simulating Public Opinion: Comparing Distributional and Individual-Level Predictions from LLMs and Random Forests.
Understanding and modeling the flow of information in human societies is essential for capturing phenomena such as polarization, opinion formation, and misinformation diffusion. Traditional agent-based models often rely on simplified behavioral rules that fail to capture the nuanced and context-sensitive nature of human decision-making. In this study, we explore the potential of Large Language Models (LLMs) as data-driven, high-fidelity agents capable of simulating individual opinions under varying informational conditions. Conditioning LLMs on real survey data from the 2020 American National Election Studies (ANES), we investigate their ability to predict individual-level responses across a spectrum of political and social issues in a zero-shot setting, without any training on the survey outcomes. Using Jensen-Shannon distance to quantify divergence in opinion distributions and F1-score to measure predictive accuracy, we compare LLM-generated simulations to those produced by a supervised Random Forest model. While performance at the individual level is comparable, LLMs consistently produce aggregate opinion distributions closer to the empirical ground truth. These findings suggest that LLMs offer a promising new method for simulating complex opinion dynamics and modeling the probabilistic structure of belief systems in computational social science.
期刊介绍:
Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. Our aim is to encourage scientists to publish as much as possible their theoretical and experimental details. There is no restriction on the length of the papers. If there are computation and the experiment, the details must be provided so that the results can be reproduced.