Test-retest reliability of reinforcement learning parameters.

Behavior Research Methods · Published 2024-08-01 (Epub 2023-09-08) · DOI: 10.3758/s13428-023-02203-4
Impact Factor 4.6 · JCR Q1 (Psychology, Experimental) · CAS Zone 2 (Psychology)
Jessica V Schaaf, Laura Weidinger, Lucas Molleman, Wouter van den Bos
Volume pages: 4582-4599 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11289054/pdf/ · Citations: 0

Abstract

It has recently been suggested that parameter estimates of computational models can be used to understand individual differences at the process level. One area of research in which this approach, called computational phenotyping, has taken hold is computational psychiatry. One requirement for successful computational phenotyping is that behavior and parameters are stable over time. Surprisingly, the test-retest reliability of behavior and model parameters remains unknown for most experimental tasks and models. The present study seeks to close this gap by investigating the test-retest reliability of canonical reinforcement learning models in the context of two often-used learning paradigms: a two-armed bandit and a reversal learning task. We tested independent cohorts for the two tasks (N = 69 and N = 47) via an online testing platform with a between-test interval of five weeks. Whereas reliability was high for personality and cognitive measures (with ICCs ranging from .67 to .93), it was generally poor for the parameter estimates of the reinforcement learning models (with ICCs ranging from .02 to .52 for the bandit task and from .01 to .71 for the reversal learning task). Given that simulations indicated that our procedures could detect high test-retest reliability, this suggests that a significant proportion of the variability must be ascribed to the participants themselves. In support of that hypothesis, we show that mood (stress and happiness) can partly explain within-participant variability. Taken together, these results are critical for current practices in computational phenotyping and suggest that individual variability should be taken into account in the future development of the field.
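The abstract names the two key ingredients of the study without giving code: canonical reinforcement learning models fit to bandit behavior, and intraclass correlations (ICCs) comparing parameter estimates across two test sessions. As a minimal sketch of that setup, the following assumes the standard delta-rule Q-learner with softmax choice (learning rate and inverse temperature as the two parameters) and the ICC(2,1) variant; these specific choices are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def simulate_bandit(alpha, beta, probs=(0.7, 0.3), n_trials=100, rng=None):
    """Delta-rule Q-learner with softmax choice on a two-armed bandit.

    alpha: learning rate; beta: inverse temperature (choice consistency).
    """
    rng = rng if rng is not None else np.random.default_rng()
    q = np.zeros(2)
    choices = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        p = np.exp(beta * q)
        p /= p.sum()                        # softmax choice probabilities
        c = rng.choice(2, p=p)
        r = float(rng.random() < probs[c])  # Bernoulli reward
        q[c] += alpha * (r - q[c])          # delta-rule value update
        choices[t] = c
    return choices

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    scores: (n_subjects, k_sessions) matrix of parameter estimates.
    """
    n, k = scores.shape
    grand = scores.mean()
    ss_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((scores.mean(axis=0) - grand) ** 2).sum()  # between sessions
    ss_err = ((scores - grand) ** 2).sum() - ss_rows - ss_cols
    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = ss_err / ((n - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)
```

If each participant's parameter estimate were perfectly stable across sessions, the two columns of `scores` would agree and the ICC would equal 1; the .02 to .52 range reported for the bandit task reflects far noisier session-to-session estimates.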

Source journal: Behavior Research Methods · CiteScore 10.30 · Self-citation rate: 9.30% · Articles per year: 266

Journal description: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.