Testing bipolarity.
Kimberly A Barchard, James M Carroll, Shawn Reynolds, James A Russell
Psychological Methods, published online December 12, 2024. DOI: 10.1037/met0000707

Many psychological dimensions seem bipolar (e.g., happy-sad, optimism-pessimism, and introversion-extraversion). However, seeming opposites frequently do not act the way researchers predict real opposites would: having correlations near -1, loading on the same factor, and having relations with external variables that are equal in magnitude and opposite in sign. We argue these predictions are often incorrect because the bipolar model has been misspecified or specified too narrowly. We therefore explicitly define a general bipolar model for ideal error-free data and then extend this model to empirical data influenced by random and systematic measurement error. Our model shows the predictions above are correct only under restrictive circumstances that are unlikely to apply in practice. Moreover, if a bipolar dimension is divided into two so that researchers can test bipolarity, our model shows that the correlation between the two can be far from -1; thus, strategies based upon Pearson product-moment correlations and their factor analyses do not test if variables are opposites. Moreover, the two parts need not be mutually exclusive; thus, measures of co-occurrence do not test if variables are opposites. We offer alternative strategies for testing if variables are opposites, strategies based upon censored data analysis. Our model and findings have implications not just for testing bipolarity, but also for associated theory and measurement, and they expose potential artifacts in correlational and dimensional analyses involving any type of negative relations.
Bayes factors for logistic (mixed-effect) models.
Catriona Silvey, Zoltan Dienes, Elizabeth Wonnacott
Psychological Methods, published online December 12, 2024. DOI: 10.1037/met0000714

In psychology, we often want to know whether or not an effect exists. The traditional way of answering this question is to use frequentist statistics. However, a significance test against a null hypothesis of no effect cannot distinguish between two states of affairs: evidence of absence of an effect and absence of evidence for or against an effect. Bayes factors can make this distinction; however, uptake of Bayes factors in psychology has so far been low for two reasons. First, they require researchers to specify the range of effect sizes their theory predicts. Researchers are often unsure about how to do this, leading to the use of inappropriate default values which may give misleading results. Second, many implementations of Bayes factors have a substantial technical learning curve. We present a case study and simulations demonstrating a simple method for generating a range of plausible effect sizes, that is, a model of the alternative hypothesis (H1), for treatment effects where there is a binary dependent variable. We illustrate this mainly using estimates from frequentist logistic mixed-effects models (because of their widespread adoption) but also using Bayesian model comparison with Bayesian hierarchical models (which have increased flexibility). Bayes factors calculated using these estimates provide intuitively reasonable results across a range of real effect sizes.
{"title":"The role of a quadratic term in estimating the average treatment effect from longitudinal randomized controlled trials with missing data.","authors":"Manshu Yang, Lijuan Wang, Scott E Maxwell","doi":"10.1037/met0000709","DOIUrl":"https://doi.org/10.1037/met0000709","url":null,"abstract":"<p><p>Longitudinal randomized controlled trials (RCTs) have been commonly used in psychological studies to evaluate the effectiveness of treatment or intervention strategies. Outcomes in longitudinal RCTs may follow either straight-line or curvilinear change trajectories over time, and missing data are almost inevitable in such trials. The current study aims to investigate (a) whether the estimate of average treatment effect (ATE) would be biased if a straight-line growth (SLG) model is fit to longitudinal RCT data with quadratic growth and missing completely at random (MCAR) or missing at random (MAR) data, and (b) whether adding a quadratic term to an SLG model would improve the ATE estimation and inference. Four models were compared via a simulation study, including the SLG model, the quadratic growth model with arm-invariant and fixed quadratic effect (QG-AIF), the quadratic growth model with arm-specific and fixed quadratic effects (QG-ASF), and the quadratic growth model with arm-specific and random quadratic effects (QG-ASR). Results suggest that fitting an SLG model to quadratic growth data often yielded severe biases in ATE estimates, even if data were MCAR or MAR. Given four or more waves of longitudinal data, the QG-ASR model outperformed the other methods; for three-wave data, the QG-ASR model was not applicable and the QG-ASF model performed well. Applications of different models are also illustrated using an empirical data example. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A guided tutorial on linear mixed-effects models for the analysis of accuracies and response times in experiments with fully crossed design.
Ottavia M Epifania, Pasquale Anselmi, Egidio Robusto
Psychological Methods, published online December 12, 2024. DOI: 10.1037/met0000708

Experiments with fully crossed designs are often used across several fields of experimental psychology, from cognitive psychology to social cognition. These experiments consist of the presentation of stimuli representing superordinate categories, which have to be sorted into the correct category under two contrasting conditions. This tutorial presents a linear mixed-effects model approach for obtaining Rasch-like parameterizations of the response times and accuracies from fully crossed design data. The modeling framework is outlined along with a step-by-step guide to its application, which is further illustrated with two practical examples based on empirical data. The first example is a cognitive psychology experiment evaluating a spatial-numerical association of response codes effect. The second is a social cognition experiment on the implicit evaluation of racial attitudes. A fully commented R script for reproducing the analyses illustrated in the examples is available in the online supplemental materials.
{"title":"Power analysis to detect misfit in SEMs with many items: Resolving unrecognized problems, relating old and new approaches, and \"matching\" power analysis approach to data analysis approach.","authors":"Amy Liang, Sonya K Sterba","doi":"10.1037/met0000684","DOIUrl":"https://doi.org/10.1037/met0000684","url":null,"abstract":"<p><p>It is unappreciated that there are four different approaches to power analysis for detecting misspecification by testing overall fit of structural equation models (SEMs) and, moreover, that common approaches can yield radically diverging results for SEMs with many items (high <i>p</i>). Here we newly relate these four approaches. Analytical power analysis methods using theoretical null and theoretical alternative distributions (Approach 1) have a long history, are widespread, and are often contrasted with \"the\" Monte Carlo method-which is an oversimplification. Actually, three Monte Carlo methods can be distinguished; all use an empirical alternative distribution but differ regarding whether the null distribution is theoretical (Approach 2), empirical (Approach 3), or-as we newly propose and demonstrate the need for-adjusted empirical (Approach 4). Because these four approaches can yield radically diverging power results under high <i>p</i> (as demonstrated here), researchers need to \"match\" their a priori SEM power analysis approach to their later SEM data analysis approach for testing overall fit, once data are collected. Disturbingly, the most common power analysis approach for a global test-of-fit is mismatched with the most common data analysis approach for a global test-of-fit in SEM. Because of this mismatch, researchers' anticipated versus actual/obtained power can differ substantially. We explain how/why to \"match\" across power-analysis and data-analysis phases of a study and provide software to facilitate doing so. As extensions, we explain how to relate and implement all four approaches to power analysis (a) for testing overall fit using χ² versus root-mean-square error of approximation and (b) for testing overall fit versus testing a target parameter/effect. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Definition and identification of causal ratio effects.","authors":"Christoph Kiefer, Benedikt Lugauer, Axel Mayer","doi":"10.1037/met0000711","DOIUrl":"https://doi.org/10.1037/met0000711","url":null,"abstract":"<p><p>In generalized linear models, the effect of a treatment or intervention is often expressed as a ratio (e.g., risk ratio and odds ratio). There is discussion about when ratio effect measures can be interpreted in a causal way. For example, ratio effect measures suffer from noncollapsibility, that is, even in randomized experiments, the average over individual ratio effects is not identical to the (unconditional) ratio effect based on group means. Even more, different ratio effect measures (e.g., simple ratio and odds ratio) can point into different directions regarding the effectiveness of the treatment making it difficult to decide which one is the causal effect of interest. While causality theories do in principle allow for ratio effects, the literature lacks a comprehensive derivation and definition of ratio effect measures and their possible identification from a causal perspective (including, but not restricted to randomized experiments). In this article, we show how both simple ratios and odds ratios can be defined based on the stochastic theory of causal effects. Then, we examine if and how expectations of these effect measures can be identified under four causality conditions. Finally, we discuss an alternative computation of ratio effects as ratios of causally unbiased expectations instead of expectations of individual ratios, which is identifiable under all causality conditions and consistent with difference effects. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Building a simpler moderated nonlinear factor analysis model with Markov Chain Monte Carlo estimation.
Craig K Enders, Juan Diego Vera, Brian T Keller, Agatha Lenartowicz, Sandra K Loo
Psychological Methods, published online December 12, 2024. DOI: 10.1037/met0000712

Moderated nonlinear factor analysis (MNLFA) has emerged as an important and flexible data analysis tool, particularly in integrative data analysis settings and in psychometric studies of measurement invariance and differential item functioning. Substantive applications abound in the literature and span a broad range of disciplines. MNLFA unifies the item response theory, multiple-group, and multiple indicator multiple cause modeling traditions, and it extends these frameworks by conceptualizing latent variable heterogeneity as a source of differential item functioning. The purpose of this article is to illustrate a flexible Markov chain Monte Carlo-based approach to MNLFA that offers statistical and practical enhancements to likelihood-based estimation while remaining plug-and-play with established analytic practices. Among other things, these enhancements include (a) missing data handling for incomplete moderators; (b) multiply imputed factor score estimates that integrate into existing multiple imputation inferential methods; (c) support for common data types, including normal/continuous, nonnormal/continuous, binary, ordinal, multicategorical nominal, count, and two-part constructions for floor and ceiling effects; (d) novel residual diagnostics for identifying potential sources of differential item functioning; (e) manifest-by-latent variable interaction effects that replace complex moderation function constraints; and (f) integration with familiar regression modeling strategies, including graphical diagnostics. A real-data example using the Blimp software application illustrates these features.
{"title":"Better power by design: Permuted-subblock randomization boosts power in repeated-measures experiments.","authors":"Jinghui Liang, Dale J Barr","doi":"10.1037/met0000717","DOIUrl":"https://doi.org/10.1037/met0000717","url":null,"abstract":"<p><p>During an experimental session, participants adapt and change due to learning, fatigue, fluctuations in attention, or other physiological or environmental changes. This temporal variation affects measurement, potentially reducing statistical power. We introduce a restricted randomization algorithm, permuted-subblock randomization (PSR), that boosts power by balancing experimental conditions over the course of an experimental session. We used Monte Carlo simulations to explore the performance of PSR across four scenarios of time-dependent error: exponential decay (learning effect), Gaussian random walk, pink noise, and a mixture of the previous three. PSR boosted power by about 13% on average, with a range from 4% to 45% across a representative set of study designs, while simultaneously controlling the false positive rate when time-dependent variation was absent. An R package, explan, provides functions to implement PSR during experiment planning. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meta-analysis of Monte Carlo simulations examining class enumeration accuracy with mixture models.
Tiffany A Whittaker, Jihyun Lee, Devin Dedrick, Christina Muñoz
Psychological Methods, published online December 12, 2024. DOI: 10.1037/met0000716

This article walks through the steps of conducting a meta-analysis of Monte Carlo simulation studies. The selected Monte Carlo simulation studies focused on mixture modeling, which is becoming increasingly popular in the social and behavioral sciences. We provide details for the following steps in a meta-analysis: (a) formulating a research question; (b) identifying the relevant literature; (c) screening the literature; (d) extracting data; (e) analyzing the data; and (f) interpreting and discussing the findings. Our goal was to investigate which simulation design factors (moderators) impact class enumeration accuracy in mixture modeling analyses. We analyzed the meta-analytic data using a generalized linear mixed model with a multilevel structure and examined the impact of the design moderators on the outcome of interest with a meta-regression model. For instance, the Bayesian information criterion was found to perform more accurately in conditions with larger sample sizes, whereas entropy performed more accurately with smaller sample sizes. We hope this article can serve as a guide for others seeking to quantitatively synthesize results from Monte Carlo simulation studies. In turn, the findings from meta-analyzing Monte Carlo simulation studies can provide more detail about the factors that influence outcomes of interest and can help methodologists when planning Monte Carlo simulation studies.
{"title":"Planning falsifiable confirmatory research.","authors":"James E Kennedy","doi":"10.1037/met0000639","DOIUrl":"https://doi.org/10.1037/met0000639","url":null,"abstract":"<p><p>Falsifiable research is a basic goal of science and is needed for science to be self-correcting. However, the methods for conducting falsifiable research are not widely known among psychological researchers. Describing the effect sizes that can be confidently investigated in confirmatory research is as important as describing the subject population. Power curves or operating characteristics provide this information and are needed for both frequentist and Bayesian analyses. These evaluations of inferential error rates indicate the performance (validity and reliability) of the planned statistical analysis. For meaningful, falsifiable research, the study plan should specify a minimum effect size that is the goal of the study. If any tiny effect, no matter how small, is considered meaningful evidence, the research is not falsifiable and often has negligible predictive value. Power ≥ .95 for the minimum effect is optimal for confirmatory research and .90 is good. From a frequentist perspective, the statistical model for the alternative hypothesis in the power analysis can be used to obtain a <i>p</i> value that can reject the alternative hypothesis, analogous to rejecting the null hypothesis. However, confidence intervals generally provide more intuitive and more informative inferences than p values. The preregistration for falsifiable confirmatory research should include (a) criteria for evidence the alternative hypothesis is true, (b) criteria for evidence the alternative hypothesis is false, and (c) criteria for outcomes that will be inconclusive. Not all confirmatory studies are or need to be falsifiable. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}