Psychological Methods · Pub Date: 2024-04-01 · Epub Date: 2022-05-23 · DOI: 10.1037/met0000498 · Pages: 308-330
Qian Zhang
{"title":"Meta-analysis of correlation coefficients: A cautionary tale on treating measurement error.","authors":"Qian Zhang","doi":"10.1037/met0000498","DOIUrl":"10.1037/met0000498","url":null,"abstract":"<p><p>A scale to measure a psychological construct is subject to measurement error. When meta-analyzing correlations obtained from scale scores, many researchers recommend correcting for measurement error. I considered three caveats when correcting for measurement error in meta-analysis of correlations: (a) the distribution of true scores can be non-normal, resulting in violation of the normality assumption for raw correlations and Fisher's z transformed correlations; (b) coefficient alpha is often used as the reliability, but correlations corrected for measurement error using alpha can be inaccurate when some assumptions of alpha (e.g., tau-equivalence) are violated; and (c) item scores are often ordinal, making the disattenuation formula potentially problematic. Via three simulation studies, I examined the performance of two meta-analysis approaches-with raw correlations and z scores. In terms of estimation accuracy and coverage probability of the mean correlation, results showed that (a) considering the true-score distribution alone, estimation of the mean correlation was slightly worse when true scores of the constructs were skewed rather than normal; (b) when the tau-equivalence assumption was violated and coefficient alpha was used for correcting measurement error, the mean correlation estimates can be biased and coverage probabilities can be low; and (c) discretization of continuous items can result in biased estimates and undercoverage of the mean correlations even when tau-equivalence was satisfied. With more categories and/or items on a scale, results can improve whether tau-equivalence was met or not. Based on these findings, I gave recommendations for conducting meta-analyses of correlations. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"308-330"},"PeriodicalIF":7.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139747300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychological Methods · Pub Date: 2024-04-01 · Epub Date: 2022-05-12 · DOI: 10.1037/met0000502 · Pages: 407-420
T D Stanley, Hristos Doucouliagos
{"title":"Harnessing the power of excess statistical significance: Weighted and iterative least squares.","authors":"T D Stanley, Hristos Doucouliagos","doi":"10.1037/met0000502","DOIUrl":"10.1037/met0000502","url":null,"abstract":"<p><p>We introduce a new meta-analysis estimator, the weighted and iterated least squares (WILS), that greatly reduces publication selection bias (PSB) when selective reporting for statistical significance (SSS) is present. WILS is the simple weighted average that has smaller bias and rates of false positives than conventional meta-analysis estimators, the unrestricted weighted least squares (UWLS), and the weighted average of the adequately powered (WAAP) when there is SSS. As a simple weighted average, it is not vulnerable to violations in publication bias corrections models' assumptions too often seen in application. WILS is based on the novel idea of allowing excess statistical significance (ESS), which is a necessary condition of SSS, to identify when and how to reduce PSB. We show in comparisons with large-scale preregistered replications and in evidence-based simulations that the remaining bias is small. The routine application of WILS in the place of random effects would do much to reduce conventional meta-analysis's notable biases and high rates of false positives. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"407-420"},"PeriodicalIF":7.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41210714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beta-binomial meta-analysis of individual differences based on sample means and standard deviations: Studying reliability of sum scores of binary items.","authors":"Philipp Doebler, Susanne Frick, Anna Doebler","doi":"10.1037/met0000649","DOIUrl":"https://doi.org/10.1037/met0000649","url":null,"abstract":"<p><p>Individual differences are studied with a multitude of test instruments. Meta-analysis of tests is useful to understand whether individual differences in certain populations can be detected with the help of a class of tests. A method for the quantitative meta-analytical evaluation of test instruments with dichotomous items is introduced. The method assumes beta-binomially distributed test scores, an assumption that has been demonstrated to be plausible in many settings. With this assumption, the method only requires sample means and standard deviations of sum scores (or equivalently means and standard deviations of percent-correct scores), in contrast to methods that use estimates of reliability for a similar purpose. Two parameters are estimated for each sample: mean difficulty and an overdispersion parameter which can be interpreted as the test's ability to detect individual differences. The proposed bivariate meta-analytical approach (random or fixed effects) pools the two parameters simultaneously and allows to perform meta-regression. The bivariate pooling yields a between-sample correlation of mean difficulty parameters and overdispersion parameters. As a side product, reliability estimates are obtained which can be employed to disattenuate correlation coefficients for insufficient reliability when no other estimates are available. A worked example illustrates the method and R code is provided. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140132459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychological Methods · Pub Date: 2024-02-29 · DOI: 10.1037/met0000636
Esther Ulitzsch, Steffen Nestler, Oliver Lüdtke, Gabriel Nagy
A screen-time-based mixture model for identifying and monitoring careless and insufficient effort responding in ecological momentary assessment data.
Ecological momentary assessment (EMA) involves repeated real-time sampling of respondents' current behaviors and experiences. The intensive repeated assessment imposes an increased burden on respondents, rendering EMAs vulnerable to respondent noncompliance and/or careless and insufficient effort responding (C/IER). We developed a mixture modeling approach that equips researchers with a tool for (a) gauging the degree of C/IER contamination of their EMA data and (b) studying the trajectory of C/IER across the study. To separate attentive from C/IER behavior, the approach leverages collateral information from screen times, which are routinely recorded in electronically administered EMAs, and translates theoretical considerations on respondents' behavior into component models for attentive and careless screen times, as well as for the functional form of C/IER trajectories. We show how a sensible choice of component models (a) allows disentangling short screen times due to C/IER from familiarity effects due to repeated exposure to the same measures, (b) aids in gaining a fine-grained understanding of C/IER trajectories by distinguishing within-day from between-day effects, and (c) allows investigating interindividual differences in attentiveness. The approach shows good parameter recovery when attentive and C/IER screen time distributions exhibit sufficient separation, and it yields valid conclusions even in scenarios of uncontaminated data. The approach is illustrated on EMA data from the German Socio-Economic Panel innovation sample.
{"title":"The monotonic linear model: Testing for removable interactions.","authors":"John C Dunn, Laura M Anderson","doi":"10.1037/met0000626","DOIUrl":"https://doi.org/10.1037/met0000626","url":null,"abstract":"<p><p>Loftus (1978) highlighted the distinction between a theoretical concept such as memory or attention, and its observed measure such as hit rate or percent correct. If the functional relationship between the concept and its measure is nonlinear then only some interaction effects are interpretable. This is an example of the wider \"problem of coordination\" which pervades scientific measurement. Loftus drew on the principles of additive conjoint measurement (ACM) to discuss the consequences when the coordination function is assumed to be monotonic. This led to the distinction between removable interactions that are consistent with an additive effect on the underlying theoretical concept and nonremovable interactions that are not. However, the adoption of these ideas by researchers has been greatly limited by the fact that no statistical procedure exists to determine if and to what extent an interaction is removable or otherwise. The lack of such a procedure has similarly limited the impact of ACM on research practice. The aim of this article is to present such a procedure. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139997242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating curvilinear time-varying treatment effects: Combining g-estimation of structural nested mean models with time-varying effect models for longitudinal causal inference.","authors":"Wen Wei Loh","doi":"10.1037/met0000637","DOIUrl":"https://doi.org/10.1037/met0000637","url":null,"abstract":"<p><p>Longitudinal designs can fortify causal inquiries of a focal predictor (i.e., treatment) on an outcome. But valid causal inferences are complicated by causal feedback between confounders and treatment over time. G-estimation of a structural nested mean model (SNMM) is designed to handle the complexities beset by measured time-varying or treatment-dependent confounding in longitudinal data. But valid inference requires correctly specifying the functional form of the SNMM, such as how the effects stay constant or change over time. In this article, we develop a g-estimation strategy for linear structural nested mean models whose causal parameters adopt the form of time-varying coefficient functions. These time-varying coefficient functions are smooth semiparametric functions of time that permit probing how the treatment effects may change curvilinearly. Further effect modification by time-invariant and time-varying covariates can be readily postulated in the SNMM to test fine-grained effect heterogeneity. We then describe a g-estimation strategy for estimating such an SNMM. We utilize the established time-varying effect model (TVEM) approach from the prevention and psychotherapy research literature for modeling flexible changes in covariate-outcome associations over time. Moreover, we exploit a known benefit of g-estimation over routine regression methods: its double robustness conferring protection against biases induced by certain forms of model misspecification. We encourage psychology researchers seeking correct causal conclusions from longitudinal data to use an SNMM with time-varying coefficient functions to assess curvilinear causal effects over time, and to use g-estimation with TVEM to resolve measured treatment-dependent confounding. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139735974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear mixed models and latent growth curve models for group comparison studies contaminated by outliers.","authors":"Fabio Mason, Eva Cantoni, Paolo Ghisletta","doi":"10.1037/met0000643","DOIUrl":"10.1037/met0000643","url":null,"abstract":"<p><p>The linear mixed model (LMM) and latent growth model (LGM) are frequently applied to within-subject two-group comparison studies to investigate group differences in the time effect, supposedly due to differential group treatments. Yet, research about LMM and LGM in the presence of outliers (defined as observations with a very low probability of occurrence if assumed from a given distribution) is scarce. Moreover, when such research exists, it focuses on estimation properties (bias and efficiency), neglecting inferential characteristics (e.g., power and type-I error). We study power and type-I error rates of Wald-type and bootstrap confidence intervals (CIs), as well as coverage and length of CIs and mean absolute error (MAE) of estimates, associated with classical and robust estimations of LMM and LGM, applied to a within-subject two-group comparison design. We conduct a Monte Carlo simulation experiment to compare CIs and MAEs under different conditions: data (a) without contamination, (b) contaminated by within-subject outliers, (c) contaminated by between-subject outliers, and (d) both contaminated by within- and between-subject outliers. Results show that without contamination, methods perform similarly, except CIs based on S, a robust LMM estimator, which are slightly less close to nominal values in their coverage. However, in the presence of both within- and between-subject outliers, CIs based on robust estimators, especially S, performed better than those of classical methods. In particular, the percentile CI with the wild bootstrap applied to the robust LMM estimators outperformed all other methods, especially with between-subject outliers, when we found the classical Wald-type CI based on the t statistic with Satterthwaite approximation for LMM to be highly misleading. We provide R code to compute all methods presented here. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139735975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Individual-level probabilities and cluster-level proportions: Toward interpretable level 2 estimates in unconflated multilevel models for binary outcomes.","authors":"Timothy Hayes","doi":"10.1037/met0000646","DOIUrl":"https://doi.org/10.1037/met0000646","url":null,"abstract":"<p><p>Multilevel models allow researchers to test hypotheses at multiple levels of analysis-for example, assessing the effects of both individual-level and school-level predictors on a target outcome. To assess these effects with the greatest clarity, researchers are well-advised to cluster mean center all Level 1 predictors and explicitly incorporate the cluster means into the model at Level 2. When an outcome of interest is continuous, this unconflated model specification serves both to increase model accuracy, by separating the level-specific effects of each predictor, and to increase model interpretability, by reframing the random intercepts as unadjusted cluster means. When an outcome of interest is binary or ordinal, however, only the first of these benefits is fully realized: In these models, the intuitive cluster mean interpretations of Level 2 effects are only available on the metric of the linear predictor (e.g., the logit) or, equivalently, the latent response propensity, <i>y</i><sub>ij</sub>∗. Because the calculations for obtaining predicted probabilities, odds, and <i>OR</i>s operate on the entire combined model equation, the interpretations of these quantities are inextricably tied to individual-level, rather than cluster-level, outcomes. This is unfortunate, given that the probability and odds metrics are often of greatest interest to researchers in practice. To address this issue, I propose a novel rescaling method designed to calculate cluster average success proportions, odds, and <i>OR</i>s in two-level binary and ordinal logistic and probit models. I apply the approach to a real data example and provide supplemental R functions to help users implement the method easily. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139707688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Summed versus estimated factor scores: Considering uncertainties when using observed scores.","authors":"Yang Liu, Jolynn Pek","doi":"10.1037/met0000644","DOIUrl":"https://doi.org/10.1037/met0000644","url":null,"abstract":"<p><p>Observed scores (e.g., summed scores and estimated factor scores) are assumed to reflect underlying constructs and have many uses in psychological science. Constructs are often operationalized as latent variables (LVs), which are mathematically defined by their relations with manifest variables in an LV measurement model (e.g., common factor model). We examine the performance of several types of observed scores for the purposes of (a) estimating latent scores and classifying people and (b) recovering structural relations among LVs. To better reflect practice, our evaluation takes into account different sources of uncertainty (i.e., sampling error and model error). We review psychometric properties of observed scores based on the classical test theory applied to common factor models, report on a simulation study examining their performance, and provide two empirical examples to illustrate how different scores perform under different conditions of reliability, sample size, and model error. We conclude with general recommendations for using observed scores and discuss future research directions. (PsycInfo Database Record (c) 2024 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139707635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychological Methods · Pub Date: 2024-02-08 · DOI: 10.1037/met0000641
Angelika M Stefan, Quentin F Gronau, Eric-Jan Wagenmakers
Interim design analysis using Bayes factor forecasts.
A fundamental part of experimental design is to determine the sample size of a study. However, sparse information about population parameters and effect sizes before data collection renders effective sample size planning challenging. Specifically, sparse information may lead research designs to be based on inaccurate a priori assumptions, causing studies to use resources inefficiently or to produce inconclusive results. Despite its deleterious impact on sample size planning, many prominent methods for experimental design fail to adequately address the challenge of sparse a priori information. Here we propose a Bayesian Monte Carlo methodology for interim design analyses that allows researchers to analyze and adapt their sampling plans throughout the course of a study. At any point in time, the methodology uses the best available knowledge about parameters to make projections about expected evidence trajectories. Two simulated application examples demonstrate how interim design analyses can be integrated into common designs to inform sampling plans on the fly. The proposed methodology addresses the problem of sample size planning with sparse a priori information and yields research designs that are efficient, informative, and flexible.