Agentic Society: Merging skeleton from real world and texture from Large Language Model

arXiv - CS - Computers and Society. Pub Date: 2024-09-02. DOI: arxiv-2409.10550

Yuqi Bai, Kun Sun, Huishi Yin


Citations: 0


Agentic Society: Merging skeleton from real world and texture from Large Language Model

Recent advances in large language models (LLMs) and agent technologies offer promising solutions for simulating social science experiments, but the availability of the real-world population data that many such experiments require remains a major challenge. This paper explores a novel framework that leverages census data and LLMs to generate virtual populations, significantly reducing resource requirements and bypassing the privacy compliance issues associated with real-world data while preserving statistical truthfulness. Drawing on real-world census data, our approach first generates personas that reflect the demographic characteristics of the population. We then employ LLMs to enrich these personas with intricate details, using techniques akin to those in image generative models but applied to textual data. Additionally, we propose a framework for evaluating the feasibility of our method with respect to the capability of LLMs, based on personality trait tests, specifically the Big Five model, which also enhances the depth and realism of the generated personas. Through preliminary experiments and analysis, we demonstrate that our method produces personas with the variability essential for simulating diverse human behaviors in social science experiments. However, the evaluation results show that only weak signs of statistical truthfulness can be produced, owing to the limited capability of current LLMs. Insights from our study also highlight the tension within LLMs between aligning with human values and reflecting real-world complexities. Thorough and rigorous testing calls for further research. Our code is released at https://github.com/baiyuqi/agentic-society.git
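The two-stage idea in the abstract (a demographic "skeleton" drawn from census data, then LLM-written "texture") can be illustrated with a minimal sketch. This is not the authors' implementation, which lives in the linked repository. Everything named here is hypothetical: the `CENSUS_MARGINALS` table holds made-up placeholder proportions rather than real census figures, attributes are sampled independently (a simplification that ignores joint distributions), and `enrichment_prompt` only builds the text that might be sent to an LLM, without calling any model.

```python
import random

# Placeholder marginal distributions, NOT real census figures.
# A real pipeline would load these from published census tables.
CENSUS_MARGINALS = {
    "age_group": {"18-29": 0.21, "30-44": 0.25, "45-64": 0.33, "65+": 0.21},
    "sex": {"female": 0.51, "male": 0.49},
    "education": {
        "high school or less": 0.38,
        "some college": 0.28,
        "bachelor's or higher": 0.34,
    },
}


def sample_skeleton(rng):
    """Draw one demographic skeleton, sampling each attribute independently."""
    return {
        attr: rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        for attr, dist in CENSUS_MARGINALS.items()
    }


def sample_population(n, seed=0):
    """Draw n skeletons reproducibly from the placeholder marginals."""
    rng = random.Random(seed)
    return [sample_skeleton(rng) for _ in range(n)]


def enrichment_prompt(skeleton):
    """Build the text asking an LLM to add detail without changing the skeleton."""
    facts = "; ".join(f"{k.replace('_', ' ')}: {v}" for k, v in skeleton.items())
    return (
        "Write a short, realistic persona description consistent with these "
        f"census attributes ({facts}). Do not contradict any attribute."
    )


if __name__ == "__main__":
    for person in sample_population(3):
        print(enrichment_prompt(person))
```

Because the skeleton is fixed before any text is generated, the demographic mix of the virtual population is controlled by the sampling step alone; whether the LLM-written details stay statistically faithful is the separate question the abstract's evaluation addresses.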
