Time elapsed between choices in a probabilistic task correlates with repeating the same decision

Judyta Jabłońska, Łukasz Szumiec, P. Zieliński, J. Parkitna

Supplement to the European Journal of Neuroscience, pp. 2639–2654 · preprint DOI: 10.1101/643965 · published 2019-05-24
Reinforcement learning makes an action that yields a positive outcome more likely to be taken in the future. Here, we investigate how the time elapsed since an action affects subsequent decisions. Groups of C57BL/6J mice were housed in IntelliCages with ad libitum access to water and chow; they also had access to bottles containing a reward: saccharin solution, alcohol, or a mixture of the two. The probability of receiving a reward in two of the cage corners changed between 0.9 and 0.3 every 48 h over a period of ~33 days. As expected, in most animals the odds of repeating a corner choice were increased if that choice had previously been rewarded. Interestingly, the time elapsed since the previous choice also influenced the probability of repeating that choice, and this effect was independent of the previous outcome. Behavioral data were fitted to a series of reinforcement learning models. The best fits were achieved when the reward prediction update was coupled with separate learning rates for positive and negative outcomes and, additionally, a “fictitious” update of the expected value of the non-selected choice. Further inclusion of a time-dependent decay of the expected values improved the fit marginally in some cases.
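To make the model class described in the abstract concrete, the sketch below shows one plausible trial-by-trial update: a Q-learning-style value update with separate learning rates for positive and negative prediction errors, a “fictitious” update of the non-selected option, a time-dependent decay of the expected values, and softmax choice. The parameter names, the complementary-outcome form of the fictitious update, and the exponential decay are illustrative assumptions for this sketch, not the authors' exact equations.

```python
import numpy as np

def update_values(q, choice, reward, dt,
                  alpha_pos=0.3, alpha_neg=0.1, decay=0.01):
    """One trial of the sketched model (illustrative parameters):
    dual learning rates, fictitious update of the non-chosen option,
    and a time-dependent decay of expected values.

    q      -- expected values, one per corner (two options here)
    choice -- index of the chosen corner (0 or 1)
    reward -- 1 if the choice was rewarded, 0 otherwise
    dt     -- time elapsed since the previous choice (same units as decay)
    """
    q = q.copy()
    other = 1 - choice

    # Reward prediction error for the chosen option,
    # with separate learning rates for positive and negative outcomes
    delta = reward - q[choice]
    alpha = alpha_pos if delta > 0 else alpha_neg
    q[choice] += alpha * delta

    # "Fictitious" update: the non-selected option is updated as if it
    # had produced the complementary outcome (one common variant; an
    # assumption here, not necessarily the authors' formulation)
    q[other] += alpha * ((1 - reward) - q[other])

    # Time-dependent decay of expected values toward zero (assumed exponential)
    q *= np.exp(-decay * dt)
    return q


def choice_probabilities(q, beta=3.0):
    """Softmax over expected values; beta is the inverse temperature."""
    z = np.exp(beta * (q - q.max()))
    return z / z.sum()


# Example: a rewarded choice of corner 0 made 60 s after the previous choice
q = np.array([0.5, 0.5])
q = update_values(q, choice=0, reward=1, dt=60.0)
print(q, choice_probabilities(q))
```

In this sketch, a longer pause shrinks both expected values toward zero, which flattens the softmax probabilities; this is one simple way a model can let the time elapsed since the last choice modulate the tendency to repeat it.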