Logic-Enhanced Language Model Agents for Trustworthy Social Simulations

Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi
{"title":"用于可信社会模拟的逻辑增强语言模型代理","authors":"Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi","doi":"arxiv-2408.16081","DOIUrl":null,"url":null,"abstract":"We introduce the Logic-Enhanced Language Model Agents (LELMA) framework, a\nnovel approach to enhance the trustworthiness of social simulations that\nutilize large language models (LLMs). While LLMs have gained attention as\nagents for simulating human behaviour, their applicability in this role is\nlimited by issues such as inherent hallucinations and logical inconsistencies.\nLELMA addresses these challenges by integrating LLMs with symbolic AI, enabling\nlogical verification of the reasoning generated by LLMs. This verification\nprocess provides corrective feedback, refining the reasoning output. The\nframework consists of three main components: an LLM-Reasoner for producing\nstrategic reasoning, an LLM-Translator for mapping natural language reasoning\nto logic queries, and a Solver for evaluating these queries. This study focuses\non decision-making in game-theoretic scenarios as a model of human interaction.\nExperiments involving the Hawk-Dove game, Prisoner's Dilemma, and Stag Hunt\nhighlight the limitations of state-of-the-art LLMs, GPT-4 Omni and Gemini 1.0\nPro, in producing correct reasoning in these contexts. LELMA demonstrates high\naccuracy in error detection and improves the reasoning correctness of LLMs via\nself-refinement, particularly in GPT-4 Omni.","PeriodicalId":501208,"journal":{"name":"arXiv - CS - Logic in Computer Science","volume":"24 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Logic-Enhanced Language Model Agents for Trustworthy Social Simulations\",\"authors\":\"Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi\",\"doi\":\"arxiv-2408.16081\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We introduce the Logic-Enhanced Language Model Agents (LELMA) framework, a\\nnovel approach to enhance the trustworthiness of social simulations that\\nutilize large language models (LLMs). While LLMs have gained attention as\\nagents for simulating human behaviour, their applicability in this role is\\nlimited by issues such as inherent hallucinations and logical inconsistencies.\\nLELMA addresses these challenges by integrating LLMs with symbolic AI, enabling\\nlogical verification of the reasoning generated by LLMs. This verification\\nprocess provides corrective feedback, refining the reasoning output. The\\nframework consists of three main components: an LLM-Reasoner for producing\\nstrategic reasoning, an LLM-Translator for mapping natural language reasoning\\nto logic queries, and a Solver for evaluating these queries. This study focuses\\non decision-making in game-theoretic scenarios as a model of human interaction.\\nExperiments involving the Hawk-Dove game, Prisoner's Dilemma, and Stag Hunt\\nhighlight the limitations of state-of-the-art LLMs, GPT-4 Omni and Gemini 1.0\\nPro, in producing correct reasoning in these contexts. 
LELMA demonstrates high\\naccuracy in error detection and improves the reasoning correctness of LLMs via\\nself-refinement, particularly in GPT-4 Omni.\",\"PeriodicalId\":501208,\"journal\":{\"name\":\"arXiv - CS - Logic in Computer Science\",\"volume\":\"24 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Logic in Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.16081\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Logic in Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.16081","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We introduce the Logic-Enhanced Language Model Agents (LELMA) framework, a novel approach to enhance the trustworthiness of social simulations that utilize large language models (LLMs). While LLMs have gained attention as agents for simulating human behaviour, their applicability in this role is limited by issues such as inherent hallucinations and logical inconsistencies. LELMA addresses these challenges by integrating LLMs with symbolic AI, enabling logical verification of the reasoning generated by LLMs. This verification process provides corrective feedback, refining the reasoning output. The framework consists of three main components: an LLM-Reasoner for producing strategic reasoning, an LLM-Translator for mapping natural language reasoning to logic queries, and a Solver for evaluating these queries. This study focuses on decision-making in game-theoretic scenarios as a model of human interaction. Experiments involving the Hawk-Dove game, Prisoner's Dilemma, and Stag Hunt highlight the limitations of state-of-the-art LLMs, GPT-4 Omni and Gemini 1.0 Pro, in producing correct reasoning in these contexts. LELMA demonstrates high accuracy in error detection and improves the reasoning correctness of LLMs via self-refinement, particularly in GPT-4 Omni.
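To make the framework's verify-and-refine cycle concrete, below is a minimal Python sketch of a LELMA-style loop, using a one-shot Prisoner's Dilemma as the game under discussion. Everything in it (the PAYOFFS table, the solver and lelma_loop functions, the stub reasoner and translator, the round budget) is an illustrative assumption for exposition, not the paper's actual components, logic representation, or prompts.

# Minimal sketch of a LELMA-style reason -> translate -> solve -> refine loop.
# All names and values are illustrative assumptions, not the paper's API.

from typing import Callable

# One-shot Prisoner's Dilemma payoffs (row player, column player),
# standard T > R > P > S ordering with T=5, R=3, P=1, S=0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def solver(claims: list[tuple[str, str, str, float]]) -> list[str]:
    """Check payoff claims extracted from the reasoning against the game
    definition; return a corrective message for each false claim.
    Stands in for the logic solver evaluating translated queries."""
    feedback = []
    for mine, theirs, who, value in claims:
        actual = PAYOFFS[(mine, theirs)][0 if who == "row" else 1]
        if actual != value:
            feedback.append(
                f"Claimed payoff {value} for ({mine},{theirs}) is wrong; "
                f"the {who} player actually receives {actual}."
            )
    return feedback

def lelma_loop(llm_reason: Callable, llm_translate: Callable,
               max_rounds: int = 3) -> str:
    """Reason, translate, solve; feed solver errors back to the reasoner
    until the reasoning passes verification or the round budget runs out."""
    feedback: list[str] = []
    for _ in range(max_rounds):
        reasoning = llm_reason(feedback)   # LLM-Reasoner (stubbed here)
        claims = llm_translate(reasoning)  # LLM-Translator (stubbed here)
        feedback = solver(claims)          # Solver
        if not feedback:                   # verified: accept the reasoning
            return reasoning
    return reasoning  # best effort after max_rounds

if __name__ == "__main__":
    # Stub reasoner: first misstates a payoff, then corrects it on feedback.
    def stub_reason(feedback):
        if feedback:
            return "Mutual cooperation yields 3 each, so ..."
        return "Mutual cooperation yields 5 each, so ..."

    def stub_translate(reasoning):
        value = 5.0 if "5 each" in reasoning else 3.0
        return [("C", "C", "row", value)]

    print(lelma_loop(stub_reason, stub_translate))

The design point the sketch tries to capture is that the solver never generates reasoning itself: it only checks claims the translator extracts and returns targeted corrections, which is what lets the LLM self-refine against a ground-truth game definition.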