{"title":"A simple statistical framework for small sample studies.","authors":"D Samuel Schwarzkopf, Zien Huang","doi":"10.1037/met0000710","DOIUrl":"https://doi.org/10.1037/met0000710","url":null,"abstract":"<p><p>Most studies in psychology, neuroscience, and life science research make inferences about how strong an effect is on average in the population. Yet, many research questions could instead be answered by testing for the universality of the phenomenon under investigation. By using reliable experimental designs that maximize both sensitivity and specificity of individual experiments, each participant or subject can be treated as an independent replication. This approach is common in certain subfields. To date, there is however no formal approach for calculating the evidential value of such small sample studies and to define a priori evidence thresholds that must be met to draw meaningful conclusions. Here we present such a framework, based on the ratio of binomial probabilities between a model assuming the universality of the phenomenon versus the null hypothesis that any incidence of the effect is sporadic. We demonstrate the benefits of this approach, which permits strong conclusions from samples as small as two to five participants and the flexibility of sequential testing. This approach will enable researchers to preregister experimental designs based on small samples and thus enhance the utility and credibility of such studies. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142786849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why multiple hypothesis test corrections provide poor control of false positives in the real world.","authors":"Stanley E Lazic","doi":"10.1037/met0000678","DOIUrl":"https://doi.org/10.1037/met0000678","url":null,"abstract":"<p><p>Most scientific disciplines use significance testing to draw conclusions about experimental or observational data. This classical approach provides a theoretical guarantee for controlling the number of false positives across a set of hypothesis tests, making it an appealing framework for scientists seeking to limit the number of false effects or associations that they claim to observe. Unfortunately, this theoretical guarantee applies to few experiments, and the true false positive rate (FPR) is much higher. Scientists have plenty of freedom to choose the error rate to control, the tests to include in the adjustment, and the method of correction, making strong error control difficult to attain. In addition, hypotheses are often tested after finding unexpected relationships or patterns, the data are analyzed in several ways, and analyses may be run repeatedly as data accumulate. As a result, adjusted <i>p</i> values are too small, incorrect conclusions are often reached, and results are harder to reproduce. In the following, I argue why the FPR is rarely controlled meaningfully and why shrinking parameter estimates is preferable to <i>p</i> value adjustments. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142688594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulation studies for methodological research in psychology: A standardized template for planning, preregistration, and reporting.","authors":"Björn S Siepe, František Bartoš, Tim P Morris, Anne-Laure Boulesteix, Daniel W Heck, Samuel Pawel","doi":"10.1037/met0000695","DOIUrl":"10.1037/met0000695","url":null,"abstract":"<p><p>Simulation studies are widely used for evaluating the performance of statistical methods in psychology. However, the quality of simulation studies can vary widely in terms of their design, execution, and reporting. In order to assess the quality of typical simulation studies in psychology, we reviewed 321 articles published in <i>Psychological Methods, Behavior Research Methods, and Multivariate Behavioral Research</i> in 2021 and 2022, among which 100/321 = 31.2% report a simulation study. We find that many articles do not provide complete and transparent information about key aspects of the study, such as justifications for the number of simulation repetitions, Monte Carlo uncertainty estimates, or code and data to reproduce the simulation studies. To address this problem, we provide a summary of the ADEMP (aims, data-generating mechanism, estimands and other targets, methods, performance measures) design and reporting framework from Morris et al. (2019) adapted to simulation studies in psychology. Based on this framework, we provide ADEMP-PreReg, a step-by-step template for researchers to use when designing, potentially preregistering, and reporting their simulation studies. We give formulae for estimating common performance measures, their Monte Carlo standard errors, and for calculating the number of simulation repetitions to achieve a desired Monte Carlo standard error. Finally, we give a detailed tutorial on how to apply the ADEMP framework in practice using an example simulation study on the evaluation of methods for the analysis of pre-post measurement experiments. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7616844/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142626859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item response theory-based continuous test norming.","authors":"Hannah M Heister,Casper J Albers,Marie Wiberg,Marieke E Timmerman","doi":"10.1037/met0000686","DOIUrl":"https://doi.org/10.1037/met0000686","url":null,"abstract":"In norm-referenced psychological testing, an individual's performance is expressed in relation to a reference population using a standardized score, like an intelligence quotient score. The reference population can depend on a continuous variable, like age. Current continuous norming methods transform the raw score into an age-dependent standardized score. Such methods have the shortcoming to solely rely on the raw test scores, ignoring valuable information from individual item responses. Instead of modeling the raw test scores, we propose modeling the item scores with a Bayesian two-parameter logistic (2PL) item response theory model with age-dependent mean and variance of the latent trait distribution, 2PL-norm for short. Norms are then derived using the estimated latent trait score and the age-dependent distribution parameters. Simulations show that 2PL-norms are overall more accurate than those from the most popular raw score-based norming methods cNORM and generalized additive models for location, scale, and shape (GAMLSS). Furthermore, the credible intervals of 2PL-norm exhibit clearly superior coverage over the confidence intervals of the raw score-based methods. The only issue of 2PL-norm is its slightly lower performance at the tails of the norms. Among the raw score-based norming methods, GAMLSS outperforms cNORM. For empirical practice this suggests the use of 2PL-norm, if the model assumptions hold. If not, or the interest is solely in the point estimates of the extreme trait positions, GAMLSS-based norming is a better alternative. The use of the 2PL-norm is illustrated and compared with GAMLSS and cNORM using empirical data, and code is provided, so that users can readily apply 2PL-norm to their normative data. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"10 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142439230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comments on the measurement of effect sizes for indirect effects in Bayesian analysis of variance.","authors":"Sang-June Park,Youjae Yi","doi":"10.1037/met0000706","DOIUrl":"https://doi.org/10.1037/met0000706","url":null,"abstract":"Bayesian analysis of variance (BANOVA), implemented through R packages, offers a Bayesian approach to analyzing experimental data. A tutorial in Psychological Methods extensively documents BANOVA. This note critically examines a method for evaluating mediation using partial eta-squared as an effect size measure within the BANOVA framework. We first identify an error in the formula for partial eta-squared and propose a corrected version. Subsequently, we discuss limitations in the interpretability of this effect size measure, drawing on previous research, and argue for its potential unsuitability in assessing indirect effects in mediation analysis. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"106 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142436375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The potential of preregistration in psychology: Assessing preregistration producibility and preregistration-study consistency.","authors":"Olmo R van den Akker,Marjan Bakker,Marcel A L M van Assen,Charlotte R Pennington,Leone Verweij,Mahmoud M Elsherif,Aline Claesen,Stefan D M Gaillard,Siu Kit Yeung,Jan-Luca Frankenberger,Kai Krautter,Jamie P Cockcroft,Katharina S Kreuer,Thomas Rhys Evans,Frédérique M Heppel,Sarah F Schoch,Max Korbmacher,Yuki Yamada,Nihan Albayrak-Aydemir,Shilaan Alzahawi,Alexandra Sarafoglou,Maksim M Sitnikov,Filip Děchtěrenko,Sophia Wingen,Sandra Grinschgl,Helena Hartmann,Suzanne L K Stewart,Cátia M F de Oliveira,Sarah Ashcroft-Jones,Bradley J Baker,Jelte M Wicherts","doi":"10.1037/met0000687","DOIUrl":"https://doi.org/10.1037/met0000687","url":null,"abstract":"Study preregistration has become increasingly popular in psychology, but its potential to restrict researcher degrees of freedom has not yet been empirically verified. We used an extensive protocol to assess the producibility (i.e., the degree to which a study can be properly conducted based on the available information) of preregistrations and the consistency between preregistrations and their corresponding papers for 300 psychology studies. We found that preregistrations often lack methodological details and that undisclosed deviations from preregistered plans are frequent. These results highlight that biases due to researcher degrees of freedom remain possible in many preregistered studies. More comprehensive registration templates typically yielded more producible preregistrations. We did not find that the producibility and consistency of preregistrations differed over time or between original and replication studies. Furthermore, we found that operationalizations of variables were generally preregistered more producible and consistently than other study parts. Inconsistencies between preregistrations and published studies were mainly encountered for data collection procedures, statistical models, and exclusion criteria. Our results indicate that, to unlock the full potential of preregistration, researchers in psychology should aim to write more producible preregistrations, adhere to these preregistrations more faithfully, and more transparently report any deviations from their preregistrations. This could be facilitated by training and education to improve preregistration skills, as well as the development of more comprehensive templates. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"124 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142436379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lagged multidimensional recurrence quantification analysis for determining leader-follower relationships within multidimensional time series.","authors":"Alon Tomashin,Ilanit Gordon,Giuseppe Leonardi,Yair Berson,Nir Milstein,Matthias Ziegler,Ursula Hess,Sebastian Wallot","doi":"10.1037/met0000691","DOIUrl":"https://doi.org/10.1037/met0000691","url":null,"abstract":"The current article introduces lagged multidimensional recurrence quantification analysis. The method is an extension of multidimensional recurrence quantification analysis and allows to quantify the joint dynamics of multivariate time series and to investigate leader-follower relationships in behavioral and physiological data. Moreover, the method enables the quantification of the joint dynamics of a group, when such leader-follower relationships are taken into account. We first provide a formal presentation of the method, and then apply it to synthetic data, as well as data sets from joint action research, investigating the shared dynamics of facial expression and beats-per-minute recordings within different groups. A wrapper function is included, for applying the method together with the \"crqa\" package in R. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"85 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142436376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harvesting heterogeneity: Selective expertise versus machine learning.","authors":"Rumen Iliev,Alex Filipowicz,Francine Chen,Nikos Arechiga,Scott Carter,Emily Sumner,Totte Harinen,Kate Sieck,Kent Lyons,Charlene Wu","doi":"10.1037/met0000640","DOIUrl":"https://doi.org/10.1037/met0000640","url":null,"abstract":"The heterogeneity of outcomes in behavioral research has long been perceived as a challenge for the validity of various theoretical models. More recently, however, researchers have started perceiving heterogeneity as something that needs to be not only acknowledged but also actively addressed, particularly in applied research. A serious challenge, however, is that classical psychological methods are not well suited for making practical recommendations when heterogeneous outcomes are expected. In this article, we argue that heterogeneity requires a separation between basic and applied behavioral methods, and between different types of behavioral expertise. We propose a novel framework for evaluating behavioral expertise and suggest that selective expertise can easily be automated via various machine learning methods. We illustrate the value of our framework via an empirical study of the preferences towards battery electric vehicles. Our results suggest that a basic multiarm bandit algorithm vastly outperforms human expertise in selecting the best interventions. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"23 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142386319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to conduct an integrative mixed methods meta-analysis: A tutorial for the systematic review of quantitative and qualitative evidence.","authors":"Heidi M Levitt","doi":"10.1037/met0000675","DOIUrl":"https://doi.org/10.1037/met0000675","url":null,"abstract":"<p><p>This article is a guide on how to conduct mixed methods meta-analyses (sometimes called mixed methods systematic reviews, integrative meta-analyses, or integrative meta-syntheses), using an integrative approach. These aggregative methods allow researchers to synthesize qualitative and quantitative findings from a research literature in order to benefit from the strengths of both forms of analysis. The article articulates distinctions in how qualitative and quantitative methodologies work with variation to develop a coherent theoretical basis for their integration. In advancing this methodological approach to integrative mixed methods meta-analysis (IMMMA), I provide rationales for procedural decisions that support methodological integrity and address prior misconceptions that may explain why these methods have not been as commonly used as might be expected. Features of questions and subject matters that lead them to be amenable to this research approach are considered. The steps to conducting an IMMMA then are described, with illustrative examples, and in a manner open to the use of a range of qualitative and quantitative meta-analytic approaches. These steps include the development of research aims, the selection of primary research articles, the generation of units for analysis, and the development of themes and findings. The tutorial provides guidance on how to develop IMMMA findings that have methodological integrity and are based upon the appreciation of the distinctive approaches to modeling variation in quantitative and qualitative methodologies. The article concludes with guidance for report writing and developing principles for practice. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142366352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-driven covariate selection for confounding adjustment by focusing on the stability of the effect estimator.","authors":"Wen Wei Loh, Dongning Ren","doi":"10.1037/met0000564","DOIUrl":"10.1037/met0000564","url":null,"abstract":"<p><p>Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory adjusting for different covariate subsets. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inferences following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the introduced method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets. A step-by-step practical guide with user-friendly R functions is included. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"947-966"},"PeriodicalIF":7.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9356535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}