Reaching vigor tracks learned prediction error.

bioRxiv : the preprint server for biology Pub Date : 2025-05-20 DOI:10.1101/2025.03.24.645035

Colin C Korbisch, Alaa A Ahmed


Citations: 0

Abstract

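Movement vigor across multiple modalities increases with reward, suggesting that the neural circuits that represent value influence the control of movement. Dopaminergic neuron (DAN) activity in the basal ganglia has been suggested as the potential mediator of this response. If DAN activity is the bridge between value and vigor, then vigor should track the canonical mediators of that activity, namely reward expectation and reward prediction error. Here we ask whether a similar time-locked response is present in the vigor of reaching movements. We explore this link by leveraging the known phasic dopaminergic response to stochastic rewards, in which activity is modulated by reward expectation at cue and by prediction error at feedback. We used probabilistic rewards to create a reaching task involving reward expectation, reward prediction error, and learning. In one experiment, target reward probabilities were explicitly stated; in the other, they were unknown and had to be learned by the participants. We included two stochastic rewards (probabilities 33% and 66%) and two deterministic ones (probabilities 100% and 0%). In both experiments, outgoing peak velocity increased with increasing reward expectation. Furthermore, we observed a short-latency response in the vigor of the ongoing movement that tracked reward prediction error: velocity was invigorated or enervated in line with the sign and magnitude of the error. Reaching kinematics also revealed the value-update process in a trial-to-trial fashion, similar to the effect of prediction error signals typical of dopamine-mediated striatal phasic activity. Lastly, reach vigor increased with reward history, mirroring the motivational effects often linked to fluctuating dopamine levels. Taken together, our results show a nuanced link between known short-latency reward signals and both discrete and ongoing movement.

New and noteworthy: Previous studies have demonstrated the invigorating effect of reward on movement. Growing evidence suggests that this can be explained by midbrain dopamine transients. Here we show that reach vigor tracks canonical variables of learning and motivation across timescales, from milliseconds to minutes. Velocity is modulated by reward expectation, reward prediction error, and reward rate, key variables that are also linked to striatal dopaminergic fluctuations. These results point to a potential neural mechanism by which dopamine could influence both decision-making and motor control, and support the proposition that reward-based invigoration of movement is partly influenced by dopaminergic circuits.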


Rapid Dopaminergic Signatures in Movement: Reach Vigor Reflects Reward Prediction Error and Learned Expectation.

Movement vigor across multiple modalities increases with reward, suggesting that the neural circuits that represent value influence the control of movement. Dopaminergic neuron (DAN) activity has been suggested as the potential mediator of this response. If DAN activity is the bridge between value and vigor, then vigor should track canonical mediators of DAN activity, namely learning signals in the form of reward expectation and reward prediction error. Here we ask if a similar time-locked response is present in vigor of reaching movements. We explore this link by leveraging the known phasic dopaminergic response to stochastic rewards, where activity is modulated by both reward expectation at cue and the reward prediction error at feedback. We used probabilistic rewards to create a reaching task rich in reward expectation, reward prediction error, and learning. In one experiment, target reward probabilities were explicitly stated, and in the other, were left unknown and to be learned by the participants. We included two stochastic rewards (probabilities 33% and 66%) and two deterministic ones (probabilities 100% and 0%). In both experiments, outgoing peak velocity increased with increasing reward expectation. Furthermore, we observed a short-latency response in the vigor of the ongoing movement, that tracked reward prediction error: either invigorating or enervating velocity consistent with the sign and magnitude of the error. Reaching kinematics also revealed the value-update process in a trial-to-trial fashion, similar to the effect of prediction error signals typical in dopamine-mediated striatal phasic activity. Lastly, reach vigor increased with reward history over trials, mirroring the motivational effects often linked to fluctuating dopamine levels. Taken together, our results highlight the link between known short-latency dopaminergic learning signals and the invigoration of movement, not only at the time of cue presentation and movement initiation, but during an ongoing movement immediately after feedback is provided.
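
To make the quantities in the abstract concrete, the sketch below computes the reward prediction error for each of the four stated reward probabilities and shows a trial-to-trial value update. It is only an illustration under assumptions that the abstract does not state: a unit reward magnitude, the textbook delta rule (Rescorla-Wagner style) as the update, and a hypothetical learning rate of 0.1. It is not presented as the authors' model or analysis.

```python
import random

# Reward probabilities named in the abstract: two deterministic, two stochastic.
REWARD_PROBABILITIES = [0.0, 0.33, 0.66, 1.0]


def prediction_error(reward, expectation):
    """Reward prediction error: received reward minus expected reward."""
    return reward - expectation


def update_value(value, reward, learning_rate=0.1):
    """Delta-rule update (assumed here); learning_rate is a hypothetical value."""
    return value + learning_rate * prediction_error(reward, value)


def simulate_learning(probability, n_trials=200, learning_rate=0.1, seed=0):
    """Learn a target's value from Bernoulli rewards, starting from zero."""
    rng = random.Random(seed)
    value = 0.0
    for _ in range(n_trials):
        # Assumed unit reward: 1.0 when rewarded, 0.0 otherwise.
        reward = 1.0 if rng.random() < probability else 0.0
        value = update_value(value, reward, learning_rate)
    return value


if __name__ == "__main__":
    for p in REWARD_PROBABILITIES:
        rpe_rewarded = prediction_error(1.0, p)
        rpe_unrewarded = prediction_error(0.0, p)
        learned = simulate_learning(p)
        print(f"p={p:.2f}  RPE(reward)={rpe_rewarded:+.2f}  "
              f"RPE(no reward)={rpe_unrewarded:+.2f}  learned value={learned:.2f}")
```

With explicitly stated probabilities, the expectation is simply the stated probability, so the deterministic targets produce no prediction error at their usual outcome, while the 33% and 66% targets produce errors of opposite sign depending on the outcome. In the learned-probability case, a running estimate such as `simulate_learning` would stand in for the expectation.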

