{"title":"Multi-Battery Factor Analysis in R.","authors":"Niels G Waller, Casey Giordano","doi":"10.1177/01466216211066604","DOIUrl":"https://doi.org/10.1177/01466216211066604","url":null,"abstract":"Inter-battery factor analysis (IBFA) is a multivariate technique for evaluating the stability of common factors across two test batteries that have been administered to the same individuals. Tucker (1958) introduced the model in the late 1950s and derived the least squares solution for estimating model parameters. Two decades later, Browne (1979) extended Tucker’s work by (a) deriving the maximum-likelihood (ML) model estimates and (b) enabling the model to accommodate two or more test batteries (Browne, 1980). Browne’s extended model is called multiple-battery factor analysis (MBFA). Influenced by Browne’s ideas, Cudeck (1980) produced a FORTRAN program for MBFA (Cudeck, 1982) and a readable account of the method’s underlying logic. For many years, this program was the primary vehicle for conducting MBFA in a Window’s environment (Brown, 2007; Finch & West, 1997; Finch et al., 1999, Waller et al., 1991). Unfortunately, until now, open-source software for conducting IBFA and MBFA on Windows, Mac OS, Linux, and Unix operating systems was not available. To introduce the ideas of Tucker (1958) and Browne (1979, 1980) to the broader research community, two open-source programs were developed in R (R Core Team, 2021) for obtaining ML estimates for the inter-battery and MBFA models. The programs are called faIB and faMB. Both programs are included in the R fungible (Waller, 2021) library and can be freely downloaded from the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/package= fungible). faIB and faMB include a number of features that make them attractive choices for extracting common factors from two or more batteries. For instance, both programs include a wide range of rotation options by building upon functionality from the GPArotation package (Bernaards & Jennrich, 2005). This package provides routines for rotating factors by oblimin, geomin (orthogonal and oblique), infomax, simplimax, varimax, promax, and many other rotation algorithms. Both programs also allow users to initiate factor rotations from random starting configurations to facilitate the location of global and local solutions (for a discussion of why feature this is important, see Rozeboom, 1992). Prior to rotation, factors can be preconditioned (i.e., row standardized) by methods described by Kaiser (1958) or Cureton and Mulaik (1975). After rotation, factor loadings can be sorted within batteries to elucidate the structure of the","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 2","pages":"156-158"},"PeriodicalIF":1.2,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908409/pdf/10.1177_01466216211066604.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10810180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Performance of the Trifactor Model for Multiple Raters.","authors":"James Soland, Megan Kuhfeld","doi":"10.1177/01466216211051728","DOIUrl":"https://doi.org/10.1177/01466216211051728","url":null,"abstract":"<p><p>Researchers in the social sciences often obtain ratings of a construct of interest provided by multiple raters. While using multiple raters provides a way to help avoid the subjectivity of any given person's responses, rater disagreement can be a problem. A variety of models exist to address rater disagreement in both structural equation modeling and item response theory frameworks. Recently, a model was developed by Bauer et al. (2013) and referred to as the \"trifactor model\" to provide applied researchers with a straightforward way of estimating scores that are purged of variance that is idiosyncratic by rater. Although the intent of the model is to be usable and interpretable, little is known about the circumstances under which it performs well, and those it does not. We conduct simulation studies to examine the performance of the trifactor model under a range of sample sizes and model specifications and then compare model fit, bias, and convergence rates.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"53-67"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655468/pdf/10.1177_01466216211051728.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10515110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying the Distorting Effect of Rapid Guessing on Estimates of Coefficient Αlpha.","authors":"Joseph A Rios, Jiayi Deng","doi":"10.1177/01466216211051719","DOIUrl":"10.1177/01466216211051719","url":null,"abstract":"<p><p>An underlying threat to the validity of reliability measures is the introduction of systematic variance in examinee scores from unintended constructs that differ from those assessed. One construct-irrelevant behavior that has gained increased attention in the literature is rapid guessing (RG), which occurs when examinees answer quickly with intentional disregard for item content. To examine the degree of distortion in coefficient alpha due to RG, this study compared alpha estimates between conditions in which simulees engaged in full solution (i.e., do not engage in RG) versus partial RG behavior. This was done by conducting a simulation study in which the percentage and ability characteristics of rapid responders as well as the percentage and pattern of RG were manipulated. After controlling for test length and difficulty, the average degree of distortion in estimates of coefficient alpha due to RG ranged from -.04 to .02 across 144 conditions. Although slight differences were noted between conditions differing in RG pattern and RG responder ability, the findings from this study suggest that estimates of coefficient alpha are largely robust to the presence of RG due to cognitive fatigue and a low perceived probability of success.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"40-52"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655465/pdf/10.1177_01466216211051719.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10515114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Explanatory Generalized Graded Unfolding Model: Incorporating Collateral Information to Improve the Latent Trait Estimation Accuracy.","authors":"Seang-Hwane Joo, Philseok Lee, Stephen Stark","doi":"10.1177/01466216211051717","DOIUrl":"https://doi.org/10.1177/01466216211051717","url":null,"abstract":"<p><p>Collateral information has been used to address subpopulation heterogeneity and increase estimation accuracy in some large-scale cognitive assessments. The methodology that takes collateral information into account has not been developed and explored in published research with models designed specifically for noncognitive measurement. Because the accurate noncognitive measurement is becoming increasingly important, we sought to examine the benefits of using collateral information in latent trait estimation with an item response theory model that has proven valuable for noncognitive testing, namely, the generalized graded unfolding model (GGUM). Our presentation introduces an extension of the GGUM that incorporates collateral information, henceforth called <i>Explanatory GGUM</i>. We then present a simulation study that examined Explanatory GGUM latent trait estimation as a function of sample size, test length, number of background covariates, and correlation between the covariates and the latent trait. Results indicated the Explanatory GGUM approach provides scoring accuracy and precision superior to traditional expected a posteriori (EAP) and full Bayesian (FB) methods. Implications and recommendations are discussed.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"3-18"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655467/pdf/10.1177_01466216211051717.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10806575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DIFSIB: A SIBTEST Package.","authors":"James D Weese","doi":"10.1177/01466216211040498","DOIUrl":"https://doi.org/10.1177/01466216211040498","url":null,"abstract":"<p><p>The R package DIFSIB provides a direct translated version of the SIBTEST, Crossing- SIBTEST, and POLYSIBTEST procedures that were last updated and released in 2005. Having these functions directly written from Fortran into R code will allow researchers and practitioners to easily access the most recent versions of these procedures when they are conducting differential item functioning analysis and continue to improve the software more easily.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"68-69"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655466/pdf/10.1177_01466216211040498.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10806578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Lack of Robustness of a Statistic Based on the Neyman-Pearson Lemma to Violations of Its Underlying Assumptions.","authors":"Sandip Sinharay","doi":"10.1177/01466216211049209","DOIUrl":"https://doi.org/10.1177/01466216211049209","url":null,"abstract":"<p><p>Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (NPL; e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices (OAIs) of Levine and Drasgow (1988) and is the most powerful statistic for detecting item preknowledge when the assumptions underlying the statistic hold for the data (e.g., Belov, 2016Belov, 2016; Drasgow et al., 1996). This paper demonstrated using real data analysis that one assumption underlying the statistic of Drasgow et al. (1996) is often likely to be violated in practice. This paper also demonstrated, using simulated data, that the statistic is not robust to realistic violations of its underlying assumptions. Together, the results from the real data and the simulations demonstrate that the statistic of Drasgow et al. (1996) may not always be the optimum statistic in practice and occasionally has smaller power than another statistic for detecting preknowledge on a known set of items, especially when the assumptions underlying the former statistic do not hold. The findings of this paper demonstrate the importance of keeping in mind the assumptions underlying and the limitations of any statistic or method.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"19-39"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655463/pdf/10.1177_01466216211049209.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10515568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"maat: An R Package for Multiple Administrations Adaptive Testing.","authors":"Seung W Choi, Sangdon Lim, Luping Niu, Sooyong Lee, Christina M Schneider, Jay Lee, Garron J Gianopulos","doi":"10.1177/01466216211049212","DOIUrl":"https://doi.org/10.1177/01466216211049212","url":null,"abstract":"<p><p>Multiple Administrations Adaptive Testing (MAAT) is an extension of the shadow-test approach to CAT for the assessment framework involving multiple tests administered periodically throughout the year. The maat package utilizes multiple item pools vertically scaled across grades and multiple phases (stages) within each test administration, allowing for transitioning from an item pool to another as deemed necessary to further enhance the quality of assessment.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"73-74"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655464/pdf/10.1177_01466216211049212.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10515109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"autoFC: An R Package for Automatic Item Pairing in Forced-Choice Test Construction.","authors":"Mengtong Li, Tianjun Sun, Bo Zhang","doi":"10.1177/01466216211051726","DOIUrl":"https://doi.org/10.1177/01466216211051726","url":null,"abstract":"<p><p>Recently, there has been increasing interest in adopting the forced-choice (FC) test format in non-cognitive assessments, as it demonstrates faking resistance when well-designed. However, traditional or manual pairing approaches to FC test construction are time- and effort- intensive and often involve insufficient considerations. To address these issues, we developed the new open-source <i>autoFC</i> R package to facilitate automated and optimized item pairing strategies. The <i>autoFC</i> package is intended as a practical tool for FC test constructions. Users can easily obtain automatically optimized FC tests by simply inputting the item characteristics of interest. Customizations are also available for considerations on matching rules and the behaviors of the optimization process. The <i>autoFC</i> package should be of interest to researchers and practitioners constructing FC scales with potentially many metrics to match on and/or many items to pair, essentially exempting users from the burden of manual item pairing and reducing the computational costs and biases induced by simple ranking methods.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"70-72"},"PeriodicalIF":1.2,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8655462/pdf/10.1177_01466216211051726.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10515111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Important is the Choice of Bandwidth in Kernel Equating?","authors":"Gabriel Wallin, Jenny Häggström, Marie Wiberg","doi":"10.1177/01466216211040486","DOIUrl":"https://doi.org/10.1177/01466216211040486","url":null,"abstract":"<p><p>Kernel equating uses kernel smoothing techniques to continuize the discrete score distributions when equating test scores from an assessment test. The degree of smoothness of the continuous approximations is determined by the bandwidth. Four bandwidth selection methods are currently available for kernel equating, but no thorough comparison has been made between these methods. The overall aim is to compare these four methods together with two additional methods based on cross-validation in a simulation study. Both equivalent and non-equivalent group designs are used and the number of test takers, test length, and score distributions are all varied. The results show that sample size and test length are important factors for equating accuracy and precision. However, all bandwidth selection methods perform similarly with regards to the mean squared error and the differences in terms of equated scores are small, suggesting that the choice of bandwidth is not critical. The different bandwidth selection methods are also illustrated using real testing data from a college admissions test. Practical implications of the results from the simulation study and the empirical study are discussed.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"45 7-8","pages":"518-535"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8640352/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39693868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cut-Score Operating Function Extensions: Penalty-Based Errors and Uncertainty in Standard Settings.","authors":"Irina Grabovsky, Jesse Pace, Christopher Runyon","doi":"10.1177/01466216211046896","DOIUrl":"https://doi.org/10.1177/01466216211046896","url":null,"abstract":"<p><p>We model pass/fail examinations aiming to provide a systematic tool to minimize classification errors. We use the method of cut-score operating functions to generate specific cut-scores on the basis of minimizing several important misclassification measures. The goal of this research is to examine the combined effects of a known distribution of examinee abilities and uncertainty in the standard setting on the optimal choice of the cut-score. In addition, we describe an online application that allows others to utilize the cut-score operating function for their own standard settings.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"45 7-8","pages":"536-550"},"PeriodicalIF":1.2,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8640351/pdf/10.1177_01466216211046896.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39693869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}