Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models

Nunzio Lore, Sepehr Ilami, Babak Heydari
arXiv - CS - Emerging Technologies · Published 2024-08-05 · DOI: arxiv-2408.05241
Citations: 0

Abstract

As the performance of larger, newer Large Language Models continues to improve for strategic Theory of Mind (ToM) tasks, the demand for these state-of-the-art models increases commensurately. However, their deployment is costly both in terms of processing power and time. In this paper, we investigate the feasibility of creating smaller, simulation-ready agents by way of fine-tuning. To do this, we present a large pre-trained model with 20 unique scenarios that combine a social context with a social dilemma, record its answers, and use them for Q&A fine-tuning on a smaller model of the same family. Our focus is on in-context game-theoretic decision-making, the same domain within which human interaction occurs and that requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. We find that the fine-tuned smaller language model exhibited performance significantly closer to that of its larger relative, and that its improvements extended to areas and contexts beyond the ones provided in the training examples. On average across all games, through fine-tuning, the smaller model showed a 46% improvement in aligning with the behavior of the larger model, with 100% representing complete alignment. This suggests that our pipeline represents an efficient method to transmit some form of theory of mind to smaller models, creating improved and cheaply deployable algorithms in the process. Despite their simplicity and their associated shortcomings and limitations, our findings represent a stepping stone in the pursuit and training of specialized models for strategic and social decision making.
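The abstract reports alignment as a percentage, with 100% meaning the small model's choices fully match the large model's. The sketch below shows one plausible reading of that metric; the exact definition used in the paper is not given here, and the function names, the choice encoding, and the per-game data are all hypothetical.

```python
# Hedged sketch of a match-rate alignment metric, assuming alignment is the
# fraction of scenarios where the small model picks the same action as the
# large model, and "improvement" is the relative gain over the base model.

def alignment(small_choices, large_choices):
    """Fraction of scenarios where the two models pick the same action."""
    assert len(small_choices) == len(large_choices)
    matches = sum(s == l for s, l in zip(small_choices, large_choices))
    return matches / len(large_choices)

def improvement(base_align, tuned_align):
    """Relative improvement of the fine-tuned over the base small model, in %."""
    return 100 * (tuned_align - base_align) / base_align

# Toy per-scenario choices (C = cooperate, D = defect); values are illustrative.
large = ["C", "D", "C", "C", "D", "C", "D", "C"]  # larger model's answers
base  = ["D", "D", "D", "C", "C", "D", "D", "D"]  # small model before tuning
tuned = ["C", "D", "C", "C", "C", "C", "D", "D"]  # small model after tuning

base_a = alignment(base, large)    # 3/8 = 0.375
tuned_a = alignment(tuned, large)  # 6/8 = 0.75
```

On this toy data the fine-tuned model closes most of the gap to the larger model; the paper's reported 46% figure would come from averaging such per-game comparisons over all 20 scenarios.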