Jeffrey N Rouder, Mahbod Mehrvarz, Martin Schnuerch
Title: The role of reliability in experiments.
Journal: British Journal of Mathematical & Statistical Psychology (Journal Article)
Publication date: 2026-03-16
DOI: 10.1111/bmsp.70042 (https://doi.org/10.1111/bmsp.70042)
Abstract
We are concerned about the emphasis on reliability in the analysis of psychology experiments. Experiments have two elements of sample size, the number of individuals and the number of replicate trials within a task, and this complicates reliability measures. To account for these elements, we distinguish among three levels of analysis: (1) A foundational level that centers task properties without recourse to either element of sample size. An example statistic is the intraclass correlation, which is a proportion of variances defined without reference to sample sizes. (2) An intermediate level that centers the number of trials but not the number of individuals. An example statistic on this level is reliability, which describes variabilities with reference to the number of trials but not the number of individuals. (3) A final level that centers both the number of individuals and the number of trials. An example quantity is the uncertainty in a correlation coefficient, which, ideally, reflects sample size limits in both individuals and trials. Reliability thus describes an intermediate level and is useful neither for communicating foundational task properties nor for interpreting correlations. We advocate that researchers consider all three levels and highlight the role of hierarchical models in doing so.
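The contrast between the first two levels can be illustrated with a small sketch. It assumes a simple setting that the abstract does not spell out: each individual's score is the mean of a number of replicate trials, with a between-individual variance and a within-individual (trial-level) variance. The function names, the variance values, and the specific formulas (a standard variance-ratio intraclass correlation and the usual reliability of a mean score) are illustrative assumptions, not the authors' definitions.

```python
def intraclass_correlation(var_between: float, var_within: float) -> float:
    """Proportion of total variance due to individuals.

    Assumed formula: var_between / (var_between + var_within).
    It does not depend on the number of trials or individuals.
    """
    return var_between / (var_between + var_within)


def reliability_of_mean(var_between: float, var_within: float, n_trials: int) -> float:
    """Reliability of a per-individual mean over n_trials replicate trials.

    Assumed formula: var_between / (var_between + var_within / n_trials).
    It depends on the number of trials but not on the number of individuals.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    return var_between / (var_between + var_within / n_trials)


if __name__ == "__main__":
    # Hypothetical variances chosen only for illustration.
    var_between, var_within = 1.0, 9.0
    print("ICC:", intraclass_correlation(var_between, var_within))
    for n_trials in (1, 9, 81):
        print(n_trials, "trials:", reliability_of_mean(var_between, var_within, n_trials))
```

Under these assumed variances the intraclass correlation stays fixed at 0.1, while the reliability rises from 0.1 with one trial to 0.5 with 9 trials and 0.9 with 81 trials. The same task can therefore look unreliable or reliable depending only on how many trials were run, which is why reliability on its own does not communicate a task property.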
About the journal:
The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology which have a greater mathematical or statistical aspect of their argument than is usually acceptable to other journals, including:
• mathematical psychology
• statistics
• psychometrics
• decision making
• psychophysics
• classification
• relevant areas of mathematics, computing and computer software
These include articles that address substantive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models and improved algorithms for estimating the parameters of a model are examples of articles which may be favoured.