Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè, Sepehr Ilami, Babak Heydari
arXiv - CS - Emerging Technologies, published 2024-08-05. DOI: arxiv-2408.05241 (https://doi.org/arxiv-2408.05241)
Citations: 0
Abstract
As the performance of larger, newer Large Language Models continues to
improve on strategic Theory of Mind (ToM) tasks, demand for these
state-of-the-art models increases commensurately. However, their deployment is
costly in terms of both processing power and time. In this paper, we
investigate the feasibility of creating smaller, simulation-ready agents by way
of fine-tuning. To do this, we present a large pre-trained model with 20 unique
scenarios that combine a social context with a social dilemma, record its
answers, and use them for Q&A fine-tuning of a smaller model from the same
family. Our focus is on in-context game-theoretic decision-making, the same
domain within which human interaction occurs and which requires both a theory
of mind (or a semblance thereof) and an understanding of social dynamics. We
find that the fine-tuned smaller language model performs significantly closer
to its larger relative, and that its improvements extend to areas and contexts
beyond those provided in the training examples. Averaged over all games,
fine-tuning gave the smaller model a 46% improvement in aligning with the
behavior of the larger model, where 100% represents complete alignment. This
suggests that our pipeline is an efficient method for transmitting some form of
theory of mind to smaller models, creating improved and cheaply deployable
algorithms in the process. Despite its simplicity and its associated
shortcomings and limitations, our findings represent a stepping stone in the
pursuit and training of specialized models for strategic and social
decision-making.
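The pipeline described in the abstract has two recoverable pieces: building a Q&A fine-tuning dataset from the larger model's recorded answers, and scoring how closely the fine-tuned smaller model's choices align with the larger model's behavior. The sketch below is an illustrative reconstruction, not the authors' code: the scenario strings, the `teacher` stand-in function, and the definition of alignment as the fraction of matching game decisions are all assumptions, since the abstract does not specify the exact metric.

```python
def make_qa_pairs(scenarios, teacher):
    """Record the teacher (large) model's answer to each scenario,
    producing prompt/completion pairs for Q&A fine-tuning of a
    smaller model. `teacher` is any callable: prompt -> answer."""
    return [{"prompt": s, "completion": teacher(s)} for s in scenarios]


def alignment(student_choices, teacher_choices):
    """Fraction of game decisions on which the student model matches
    the teacher model; 1.0 corresponds to complete (100%) alignment.
    This definition is an assumption made for illustration."""
    if len(student_choices) != len(teacher_choices):
        raise ValueError("choice lists must have equal length")
    matches = sum(s == t for s, t in zip(student_choices, teacher_choices))
    return matches / len(teacher_choices)


if __name__ == "__main__":
    # Hypothetical stand-in for the large model's strategic answers.
    teacher = lambda s: "cooperate" if "trust" in s else "defect"
    scenarios = ["trust game with a colleague", "one-shot prisoner's dilemma"]
    pairs = make_qa_pairs(scenarios, teacher)

    # Alignment before vs. after fine-tuning (toy numbers, not the paper's).
    before = alignment(["defect", "defect", "defect", "defect", "defect"],
                       ["cooperate", "defect", "cooperate", "cooperate", "defect"])
    after = alignment(["cooperate", "defect", "cooperate", "defect", "defect"],
                      ["cooperate", "defect", "cooperate", "cooperate", "defect"])
    print(pairs[0]["completion"], before, after)
```

Under this reading, the paper's reported "46% improvement" would be the average gain in this alignment score across all 20 games after fine-tuning on the teacher's recorded answers.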