Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè, Sepehr Ilami, Babak Heydari
arXiv - CS - Emerging Technologies, published 2024-08-05. DOI: arxiv-2408.05241 (https://doi.org/arxiv-2408.05241)
Citations: 0
Abstract
As the performance of larger, newer Large Language Models continues to
improve on strategic Theory of Mind (ToM) tasks, demand for these
state-of-the-art models increases commensurately. However, their deployment is
costly in terms of both processing power and time. In this paper, we
investigate the feasibility of creating smaller, simulation-ready agents by way
of fine-tuning. To do this, we present a large pre-trained model with 20 unique
scenarios that combine a social context with a social dilemma, record its
answers, and use them for Q&A fine-tuning of a smaller model from the same
family. Our focus is on in-context game-theoretic decision-making, the same
domain within which human interaction occurs and which requires both a theory
of mind (or a semblance thereof) and an understanding of social dynamics. We
find that the fine-tuned smaller language model performs significantly closer
to its larger relative, and that its improvements extend to areas and contexts
beyond those provided in the training examples. Averaged over all games,
fine-tuning gave the smaller model a 46% improvement in aligning with the
behavior of the larger model, where 100% represents complete alignment. This
suggests that our pipeline is an efficient method for transmitting some form of
theory of mind to smaller models, creating improved and cheaply deployable
algorithms in the process. Despite its simplicity and its associated
shortcomings and limitations, our findings represent a stepping stone in the
pursuit and training of specialized models for strategic and social
decision-making.
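The pipeline described in the abstract has two recoverable pieces: building a Q&A fine-tuning dataset from the larger model's recorded answers, and scoring how closely the fine-tuned smaller model's choices align with the larger model's behavior. The sketch below is an illustrative reconstruction, not the authors' code: the scenario strings, the `teacher` stand-in function, and the definition of alignment as the fraction of matching game decisions are all assumptions, since the abstract does not specify the exact metric.

```python
def make_qa_pairs(scenarios, teacher):
    """Record the teacher (large) model's answer to each scenario,
    producing prompt/completion pairs for Q&A fine-tuning of a
    smaller model. `teacher` is any callable: prompt -> answer."""
    return [{"prompt": s, "completion": teacher(s)} for s in scenarios]


def alignment(student_choices, teacher_choices):
    """Fraction of game decisions on which the student model matches
    the teacher model; 1.0 corresponds to complete (100%) alignment.
    This definition is an assumption made for illustration."""
    if len(student_choices) != len(teacher_choices):
        raise ValueError("choice lists must have equal length")
    matches = sum(s == t for s, t in zip(student_choices, teacher_choices))
    return matches / len(teacher_choices)


if __name__ == "__main__":
    # Hypothetical stand-in for the large model's strategic answers.
    teacher = lambda s: "cooperate" if "trust" in s else "defect"
    scenarios = ["trust game with a colleague", "one-shot prisoner's dilemma"]
    pairs = make_qa_pairs(scenarios, teacher)

    # Alignment before vs. after fine-tuning (toy numbers, not the paper's).
    before = alignment(["defect", "defect", "defect", "defect", "defect"],
                       ["cooperate", "defect", "cooperate", "cooperate", "defect"])
    after = alignment(["cooperate", "defect", "cooperate", "defect", "defect"],
                      ["cooperate", "defect", "cooperate", "cooperate", "defect"])
    print(pairs[0]["completion"], before, after)
```

Under this reading, the paper's reported "46% improvement" would be the average gain in this alignment score across all 20 games after fine-tuning on the teacher's recorded answers.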