William F Christensen, Melanie M Wall, Irini Moustaki
{"title":"Assessing Dimensionality in Dichotomous Items When Many Subjects Have All-Zero Responses: An Example From Psychiatry and a Solution Using Mixture Models.","authors":"William F Christensen, Melanie M Wall, Irini Moustaki","doi":"10.1177/01466216211066602","DOIUrl":"https://doi.org/10.1177/01466216211066602","url":null,"abstract":"<p><p>Common methods for determining the number of latent dimensions underlying an item set include eigenvalue analysis and examination of fit statistics for factor analysis models with varying number of factors. Given a set of dichotomous items, the authors demonstrate that these empirical assessments of dimensionality often incorrectly estimate the number of dimensions when there is a preponderance of individuals in the sample with all-zeros as their responses, for example, not endorsing any symptoms on a health battery. Simulated data experiments are conducted to demonstrate when each of several common diagnostics of dimensionality can be expected to under- or over-estimate the true dimensionality of the underlying latent variable. An example is shown from psychiatry assessing the dimensionality of a social anxiety disorder battery where 1, 2, 3, or more factors are identified, depending on the method of dimensionality assessment. An all-zero inflated exploratory factor analysis model (AZ-EFA) is introduced for assessing the dimensionality of the underlying subgroup corresponding to those possessing the measurable trait. The AZ-EFA approach is demonstrated using simulation experiments and an example measuring social anxiety disorder from a large nationally representative survey. 
Implications of the findings are discussed, in particular, regarding the potential for different findings in community versus patient populations.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 3","pages":"167-184"},"PeriodicalIF":1.2,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073639/pdf/10.1177_01466216211066602.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9748243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing the Misclassification Costs of Cognitive Diagnosis Computerized Adaptive Testing: Item Selection With Minimum Expected Risk.","authors":"Chia-Ling Hsu, Wen-Chung Wang","doi":"10.1177/01466216211066610","DOIUrl":"https://doi.org/10.1177/01466216211066610","url":null,"abstract":"<p><p>Cognitive diagnosis computerized adaptive testing (CD-CAT) aims to identify each examinee's strengths and weaknesses on latent attributes for appropriate classification into an attribute profile. As the cost of a CD-CAT misclassification differs across user needs (e.g., remedial program vs. scholarship eligibilities), item selection can incorporate such costs to improve measurement efficiency. This study proposes such a method, <i>minimum expected risk</i> (MER), based on Bayesian decision theory. According to simulations, using MER to identify examinees with no mastery (MER-U0) or full mastery (MER-U1) showed greater classification accuracy and efficiency than other methods for these attribute profiles, especially for shorter tests or low quality item banks. For other attribute profiles, regardless of item quality or termination criterion, MER methods, modified posterior-weighted Kullback-Leibler information (MPWKL), posterior-weighted CDM discrimination index (PWCDI), and Shannon entropy (SHE) performed similarly and outperformed posterior-weighted attribute-level CDM discrimination index (PWACDI) in classification accuracy and test efficiency, especially on short tests. MER with a zero-one loss function, MER-U0, MER-U1, and PWACDI utilized item banks more effectively than the other methods. 
Overall, these results show the feasibility of using MER in CD-CAT to increase the accuracy for specific attribute profiles to address different user needs.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 3","pages":"185-199"},"PeriodicalIF":1.2,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073635/pdf/10.1177_01466216211066610.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9748238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparison of Robust Likelihood Estimators to Mitigate Bias From Rapid Guessing.","authors":"Joseph A Rios","doi":"10.1177/01466216221084371","DOIUrl":"https://doi.org/10.1177/01466216221084371","url":null,"abstract":"<p><p>Rapid guessing (RG) behavior can undermine measurement properties and score-based inferences. To mitigate this potential bias, practitioners have relied on response time information to identify and filter RG responses. However, response times may be unavailable in many testing contexts, such as paper-and-pencil administrations. When this is the case, self-report measures of effort and person-fit statistics have been used. These methods are limited in that inferences concerning motivation and aberrant responding are made at the examinee level. As test takers can engage in a mixture of solution and RG behavior throughout a test administration, there is a need to limit the influence of potential aberrant responses at the item level. This can be done by employing robust estimation procedures. Since these estimators have received limited attention in the RG literature, the objective of this simulation study was to evaluate ability parameter estimation accuracy in the presence of RG by comparing maximum likelihood estimation (MLE) to two robust variants, the bisquare and Huber estimators. Two RG conditions were manipulated, RG percentage (10%, 20%, and 40%) and pattern (difficulty-based and changing state). Compared with the MLE procedure, both the bisquare and Huber estimators reduced bias in ability parameter estimates by as much as 94%. 
Given that the Huber estimator showed smaller standard deviations of error and performed as well as the bisquare approach under most conditions, it is recommended as a promising approach to mitigating bias from RG when response time information is unavailable.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 3","pages":"236-249"},"PeriodicalIF":1.2,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073634/pdf/10.1177_01466216221084371.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9748240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BayMDS: An R Package for Bayesian Multidimensional Scaling and Choice of Dimension.","authors":"Man-Suk Oh, Eun-Kyung Lee","doi":"10.1177/01466216221084219","DOIUrl":"https://doi.org/10.1177/01466216221084219","url":null,"abstract":"MDSIC computes and plots MDSIC that can be used to select optimal number of dimensions for a given data set. There are also a few plot functions. plotObj shows pairwise scatter plots of object configuration in a Euclidean space for the first three dimensions. plotTrace provides trace plots of parameter samples for visual inspection of MCMC convergence. plotDelDist plots the observed dissimilarity measures versus Euclidean distances computed from BMDS object configuration. bayMDSApp shows the results of bayMDS in a web-based GUI (graphical user","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 3","pages":"250-251"},"PeriodicalIF":1.2,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073637/pdf/10.1177_01466216221084219.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9748237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Bolsinova, Benjamin E. Deonovic, Meirav Arieli-Attali, Burr Settles, Masato Hagiwara, G. Maris
{"title":"Measurement of Ability in Adaptive Learning and Assessment Systems when Learners Use On-Demand Hints","authors":"M. Bolsinova, Benjamin E. Deonovic, Meirav Arieli-Attali, Burr Settles, Masato Hagiwara, G. Maris","doi":"10.1177/01466216221084208","DOIUrl":"https://doi.org/10.1177/01466216221084208","url":null,"abstract":"Adaptive learning and assessment systems support learners in acquiring knowledge and skills in a particular domain. Learners’ progress is monitored as they solve items that match their level and target specific learning goals. Scaffolding and providing learners with hints are powerful tools in helping the learning process. One way of introducing hints is to make hint use the choice of the student. When the learner is certain of their response, they answer without hints, but if the learner is not certain or does not know how to approach the item, they can request a hint. We develop measurement models for applications where such on-demand hints are available. Such models take into account that hint use may be informative of ability, but at the same time may be influenced by other individual characteristics. Two modeling strategies are considered: (1) The measurement model is based on a scoring rule for ability which includes both response accuracy and hint use. (2) The choice to use hints and response accuracy conditional on this choice are modeled jointly using Item Response Tree models. The properties of different models and their implications are discussed. An application to data from Duolingo, an adaptive language learning system, is presented. Here, the best model is the scoring-rule-based model with full credit for correct responses without hints, partial credit for correct responses with hints, and no credit for all incorrect responses. 
The second dimension in the model accounts for the individual differences in the tendency to use hints.","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"219 - 235"},"PeriodicalIF":1.2,"publicationDate":"2022-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43517867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of Sampling Variability When Estimating the Explained Common Variance","authors":"Björn Andersson, Hao Luo","doi":"10.1177/01466216221084215","DOIUrl":"https://doi.org/10.1177/01466216221084215","url":null,"abstract":"Assessing multidimensionality of a scale or test is a staple of educational and psychological measurement. One approach to evaluate approximate unidimensionality is to fit a bifactor model where the subfactors are determined by substantive theory and estimate the explained common variance (ECV) of the general factor. The ECV says to what extent the explained variance is dominated by the general factor over the specific factors, and has been used, together with other methods and statistics, to determine if a single factor model is sufficient for analyzing a scale or test (Rodriguez et al., 2016). In addition, the individual item-ECV (I-ECV) has been used to assess approximate unidimensionality of individual items (Carnovale et al., 2021; Stucky et al., 2013). However, the ECV and I-ECV are subject to random estimation error which previous studies have not considered. Not accounting for the error in estimation can lead to conclusions regarding the dimensionality of a scale or item that are inaccurate, especially when an estimate of ECV or I-ECV is compared to a pre-specified cut-off value to evaluate unidimensionality. The objective of the present study is to derive standard errors of the estimators of ECV and I-ECV with linear confirmatory factor analysis (CFA) models to enable the assessment of random estimation error and the computation of confidence intervals for the parameters. We use Monte-Carlo simulation to assess the accuracy of the derived standard errors and evaluate the impact of sampling variability on the estimation of the ECV and I-ECV. In a bifactor model for J items, denote Xj, j = 1, ..., J, as the observed variable and let G denote the general factor. 
We define the S subfactors Fs, s ∈ {1, ..., S}, and Js as the set of indicators for each subfactor. Each observed indicator Xj is then defined by the multiple factor model (McDonald, 2013)","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"338 - 341"},"PeriodicalIF":1.2,"publicationDate":"2022-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42137052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Standard Errors of Kernel Equating: Accounting for Bandwidth Estimation","authors":"Kseniia Marcq, Björn Andersson","doi":"10.1177/01466216211066601","DOIUrl":"https://doi.org/10.1177/01466216211066601","url":null,"abstract":"In standardized testing, equating is used to ensure comparability of test scores across multiple test administrations. One equipercentile observed-score equating method is kernel equating, where an essential step is to obtain continuous approximations to the discrete score distributions by applying a kernel with a smoothing bandwidth parameter. When estimating the bandwidth, additional variability is introduced which is currently not accounted for when calculating the standard errors of equating. This poses a threat to the accuracy of the standard errors of equating. In this study, the asymptotic variance of the bandwidth parameter estimator is derived and a modified method for calculating the standard error of equating that accounts for the bandwidth estimation variability is introduced for the equivalent groups design. A simulation study is used to verify the derivations and confirm the accuracy of the modified method across several sample sizes and test lengths as compared to the existing method and the Monte Carlo standard error of equating estimates. The results show that the modified standard errors of equating are accurate under the considered conditions. 
Furthermore, the modified and the existing methods produce similar results which suggest that the bandwidth variability impact on the standard error of equating is minimal.","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 1","pages":"200 - 218"},"PeriodicalIF":1.2,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49283258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SEMsens: An R Package for Sensitivity Analysis of Structural Equation Models With the Ant Colony Optimization Algorithm.","authors":"Zuchao Shen, Walter L Leite","doi":"10.1177/01466216211063233","DOIUrl":"10.1177/01466216211063233","url":null,"abstract":"","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 2","pages":"159-161"},"PeriodicalIF":1.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908408/pdf/10.1177_01466216211063233.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10810177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Fit Metrics for Item Response Models.","authors":"Benjamin A Stenhaug, Benjamin W Domingue","doi":"10.1177/01466216211066603","DOIUrl":"10.1177/01466216211066603","url":null,"abstract":"<p><p>The fit of an item response model is typically conceptualized as whether a given model could have generated the data. In this study, an alternative view of fit, \"predictive fit,\" based on the model's ability to predict new data, is advocated. The authors define two prediction tasks: \"missing responses prediction,\" where the goal is to predict an in-sample person's response to an in-sample item, and \"missing persons prediction,\" where the goal is to predict an out-of-sample person's string of responses. Based on these prediction tasks, two predictive fit metrics are derived for item response models that assess how well an estimated item response model fits the data-generating model. These metrics are based on long-run out-of-sample predictive performance (i.e., if the data-generating model produced infinite amounts of data, what is the quality of a model's predictions on average?). Simulation studies are conducted to identify the prediction-maximizing model across a variety of conditions. For example, defining prediction in terms of missing responses, greater average person ability, and greater item discrimination are all associated with the 3PL model producing relatively worse predictions, and thus lead to greater minimum sample sizes for the 3PL model. In each simulation, the prediction-maximizing model is compared to the models selected by Akaike's information criterion, Bayesian information criterion (BIC), and likelihood ratio tests. It is found that performance of these methods depends on the prediction task of interest. In general, likelihood ratio tests often select overly flexible models, while BIC selects overly parsimonious models. 
The authors use Programme for International Student Assessment data to demonstrate how to use cross-validation to directly estimate the predictive fit metrics in practice. The implications for item response model selection in operational settings are discussed.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 2","pages":"136-155"},"PeriodicalIF":1.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908407/pdf/10.1177_01466216211066603.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10810179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Considerations for Fitting Dynamic Bayesian Networks With Latent Variables: A Monte Carlo Study.","authors":"Ray E Reichenberg, Roy Levy, Adam Clark","doi":"10.1177/01466216211066609","DOIUrl":"https://doi.org/10.1177/01466216211066609","url":null,"abstract":"<p><p>Dynamic Bayesian networks (DBNs; Reye, 2004) are a promising tool for modeling student proficiency under rich measurement scenarios (Reichenberg, 2018). These scenarios often present assessment conditions far more complex than what is seen with more traditional assessments and require assessment arguments and psychometric models capable of integrating those complexities. Unfortunately, DBNs remain understudied and their psychometric properties relatively unknown. The current work aimed at exploring the properties of DBNs under a variety of realistic psychometric conditions. A Monte Carlo simulation study was conducted in order to evaluate parameter recovery for DBNs using maximum likelihood estimation. Manipulated factors included sample size, measurement quality, test length, and the number of measurement occasions. Results suggested that measurement quality has the most prominent impact on estimation quality with more distinct performance categories yielding better estimation. From a practical perspective, parameter recovery appeared to be sufficient with samples as low as <i>N</i> = 400 as long as measurement quality was not poor and at least three items were present at each measurement occasion. 
Tests consisting of only a single item required exceptional measurement quality in order to adequately recover model parameters.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":"46 2","pages":"116-135"},"PeriodicalIF":1.2,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908410/pdf/10.1177_01466216211066609.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10615071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}