Katinka van der Kooij, Jeroen B J Smeets, Nina M van Mastrigt, Bernadette C M van Wijk
Experimental Brain Research 243(5):117. Published 2025-04-15. DOI: https://doi.org/10.1007/s00221-025-07074-z. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12000264/pdf/
The sign of exploration during reward-based motor learning is not independent from trial to trial.
Humans can learn various motor tasks from binary reward feedback that indicates whether a movement attempt was successful. Such 'reward-based motor learning' relies on exploiting successful motor commands and exploring different motor commands following failure. Most computational models of reward-based motor learning have formalized exploration as a random process in which, on each trial, a random draw is taken from a normal distribution centred on zero. Whether human motor exploration is indeed random from trial to trial has not yet been tested. Here we tested this in a force production task. To this end, we compared the proportion of trial-to-trial force changes in the behavioural data that have the same sign with the proportion expected under random exploration. One group of participants practiced with an adaptive reward criterion, which keeps rewarded performance close to current performance, and the other group practiced with a fixed reward criterion, under which current performance can be far from rewarded performance. In both groups, we found a proportion of same-sign changes larger than predicted. In the Adaptive group, both the learning and the proportion of same-sign changes were consistent with model simulations for low values of random exploration, whereas in the Fixed group both were inconsistent with model simulations based on random exploration. This suggests that some form of non-random motor exploration contributes to reward-based motor learning.
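The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or analysis code. It assumes that "same sign" means a trial-to-trial force change has the same sign as the change on the preceding trial, and it uses a deliberately simple learner: each trial's force is the current aim point plus a zero-centred normal draw, and the aim point moves to the produced force only after a rewarded trial. The names `proportion_same_sign`, `simulate_random_exploration`, `target`, `tolerance`, and `sigma` are hypothetical. For independent draws around an aim point that never moves, any three consecutive forces are equally likely to fall in each of their six orderings, and only the two monotonic orderings give same-sign changes, so the expected proportion is 1/3.

```python
import random


def proportion_same_sign(forces):
    """Fraction of consecutive trial-to-trial changes that share a sign.

    Pairs in which either change is exactly zero are skipped.
    """
    changes = [b - a for a, b in zip(forces, forces[1:])]
    pairs = [(c1, c2) for c1, c2 in zip(changes, changes[1:])
             if c1 != 0 and c2 != 0]
    if not pairs:
        raise ValueError("need at least two consecutive non-zero changes")
    same = sum(1 for c1, c2 in pairs if (c1 > 0) == (c2 > 0))
    return same / len(pairs)


def simulate_random_exploration(n_trials, target, tolerance, sigma, seed=0):
    """Hypothetical learner with purely random exploration.

    Assumption (not from the paper): a trial is rewarded when the force
    lies within `tolerance` of `target`; the aim point is updated to the
    produced force only after a rewarded trial.
    """
    rng = random.Random(seed)
    aim = 0.0
    forces = []
    for _ in range(n_trials):
        force = aim + rng.gauss(0.0, sigma)  # random draw centred on the aim
        forces.append(force)
        if abs(force - target) <= tolerance:  # fixed reward criterion
            aim = force                       # exploit the rewarded command
    return forces


if __name__ == "__main__":
    simulated = simulate_random_exploration(
        n_trials=1000, target=2.0, tolerance=1.0, sigma=1.0, seed=1)
    print("simulated proportion of same-sign changes:",
          round(proportion_same_sign(simulated), 3))
```

Applying `proportion_same_sign` to a participant's force series and to many simulated series would give an observed value and a reference distribution to compare it against. The reward rule and parameter values here are placeholders, not those of the study.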
About the journal:
Founded in 1966, Experimental Brain Research publishes original contributions on many aspects of experimental research of the central and peripheral nervous system. The focus is on molecular, physiology, behavior, neurochemistry, developmental, cellular and molecular neurobiology, and experimental pathology relevant to general problems of cerebral function. The journal publishes original papers, reviews, and mini-reviews.