{"title":"大型语言模型中的相对值编码:一项多任务、多模型的研究。","authors":"William M Hayes, Nicolas Yax, Stefano Palminteri","doi":"10.1162/opmi_a_00209","DOIUrl":null,"url":null,"abstract":"<p><p>In-context learning enables large language models (LLMs) to perform a variety of tasks, including solving reinforcement learning (RL) problems. Given their potential use as (autonomous) decision-making agents, it is important to understand how these models behave in RL tasks and the extent to which they are susceptible to biases. Motivated by the fact that, in humans, it has been widely documented that the value of a choice outcome depends on how it compares to other local outcomes, the present study focuses on whether similar value encoding biases apply to LLMs. Results from experiments with multiple bandit tasks and models show that LLMs exhibit behavioral signatures of relative value encoding. Adding explicit outcome comparisons to the prompt magnifies the bias, impairing the ability of LLMs to generalize from the outcomes presented in-context to new choice problems, similar to effects observed in humans. Computational cognitive modeling reveals that LLM behavior is well-described by a simple RL algorithm that incorporates relative values at the outcome encoding stage. Lastly, we present preliminary evidence that the observed biases are not limited to fine-tuned LLMs, and that relative value processing is detectable in the final hidden layer activations of a raw, pretrained model. These findings have important implications for the use of LLMs in decision-making applications.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"9 ","pages":"709-725"},"PeriodicalIF":0.0000,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12140570/pdf/","citationCount":"0","resultStr":"{\"title\":\"Relative Value Encoding in Large Language Models: A Multi-Task, Multi-Model Investigation.\",\"authors\":\"William M Hayes, Nicolas Yax, Stefano Palminteri\",\"doi\":\"10.1162/opmi_a_00209\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In-context learning enables large language models (LLMs) to perform a variety of tasks, including solving reinforcement learning (RL) problems. Given their potential use as (autonomous) decision-making agents, it is important to understand how these models behave in RL tasks and the extent to which they are susceptible to biases. Motivated by the fact that, in humans, it has been widely documented that the value of a choice outcome depends on how it compares to other local outcomes, the present study focuses on whether similar value encoding biases apply to LLMs. Results from experiments with multiple bandit tasks and models show that LLMs exhibit behavioral signatures of relative value encoding. Adding explicit outcome comparisons to the prompt magnifies the bias, impairing the ability of LLMs to generalize from the outcomes presented in-context to new choice problems, similar to effects observed in humans. Computational cognitive modeling reveals that LLM behavior is well-described by a simple RL algorithm that incorporates relative values at the outcome encoding stage. Lastly, we present preliminary evidence that the observed biases are not limited to fine-tuned LLMs, and that relative value processing is detectable in the final hidden layer activations of a raw, pretrained model. 
These findings have important implications for the use of LLMs in decision-making applications.</p>\",\"PeriodicalId\":32558,\"journal\":{\"name\":\"Open Mind\",\"volume\":\"9 \",\"pages\":\"709-725\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-05-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12140570/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Open Mind\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1162/opmi_a_00209\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q1\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Open Mind","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1162/opmi_a_00209","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"Social Sciences","Score":null,"Total":0}
Relative Value Encoding in Large Language Models: A Multi-Task, Multi-Model Investigation.
In-context learning enables large language models (LLMs) to perform a variety of tasks, including solving reinforcement learning (RL) problems. Given their potential use as (autonomous) decision-making agents, it is important to understand how these models behave in RL tasks and the extent to which they are susceptible to biases. Motivated by the fact that, in humans, it has been widely documented that the value of a choice outcome depends on how it compares to other local outcomes, the present study focuses on whether similar value encoding biases apply to LLMs. Results from experiments with multiple bandit tasks and models show that LLMs exhibit behavioral signatures of relative value encoding. Adding explicit outcome comparisons to the prompt magnifies the bias, impairing the ability of LLMs to generalize from the outcomes presented in-context to new choice problems, similar to effects observed in humans. Computational cognitive modeling reveals that LLM behavior is well-described by a simple RL algorithm that incorporates relative values at the outcome encoding stage. Lastly, we present preliminary evidence that the observed biases are not limited to fine-tuned LLMs, and that relative value processing is detectable in the final hidden layer activations of a raw, pretrained model. These findings have important implications for the use of LLMs in decision-making applications.
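To make the idea of relative value encoding concrete, the toy simulation below contrasts a delta-rule bandit learner that encodes outcomes in absolute terms with one that rescales each outcome by the best and worst outcomes available on that trial. This is a minimal sketch under assumptions of our own (a two-armed bandit with complete feedback, Gaussian payoffs, softmax choice, and range normalization as the comparison rule); the function names, parameter values, and the specific normalization are illustrative and are not the computational model fitted in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(q, beta):
    """Softmax choice rule with inverse temperature beta."""
    z = beta * (q - q.max())
    p = np.exp(z)
    return p / p.sum()

def run_bandit(n_trials=200, alpha=0.3, beta=5.0, relative=True):
    """Two-armed bandit with a delta-rule learner.

    If `relative` is True, each outcome is re-encoded relative to the
    other outcome shown on the same trial (range normalization), so the
    learned values track local comparisons rather than absolute payoffs.
    """
    means = np.array([0.36, 0.86])   # illustrative expected payoffs
    q = np.zeros(2)                  # learned option values
    choices = np.zeros(n_trials, dtype=int)

    for t in range(n_trials):
        c = rng.choice(2, p=softmax(q, beta))
        outcomes = rng.normal(means, 0.1)              # both outcomes observed
        if relative:
            lo, hi = outcomes.min(), outcomes.max()
            r = (outcomes[c] - lo) / (hi - lo + 1e-8)  # relative encoding
        else:
            r = outcomes[c]                            # absolute encoding
        q[c] += alpha * (r - q[c])                     # delta-rule update
        choices[t] = c

    return q, choices

q_rel, _ = run_bandit(relative=True)
q_abs, _ = run_bandit(relative=False)
print("relative encoding values:", np.round(q_rel, 2))
print("absolute encoding values:", np.round(q_abs, 2))
```

Under the relative scheme, the learned values saturate near the within-context extremes (roughly 0 and 1) and discard the options' absolute payoff magnitudes, which is the kind of encoding that can impair generalization when the same options are later compared against options from other contexts.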