Katherine E Castellano, Sandip Sinharay, Jiangang Hao, Chen Li
{"title":"An Investigation Into the Impact of Test Session Disruptions for At-Home Test Administrations.","authors":"Katherine E Castellano, Sandip Sinharay, Jiangang Hao, Chen Li","doi":"10.1177/01466216221128011","DOIUrl":"10.1177/01466216221128011","url":null,"abstract":"<p><p>In response to the closures of test centers worldwide due to the COVID-19 pandemic, several testing programs offered large-scale standardized assessments to examinees remotely. However, due to the varying quality of the performance of personal devices and internet connections, more at-home examinees likely suffered \"disruptions\" or an interruption in the connectivity to their testing session compared to typical test-center administrations. Disruptions have the potential to adversely affect examinees and lead to fairness or validity issues. The goal of this study was to investigate the extent to which disruptions impacted performance of at-home examinees using data from a large-scale admissions test. Specifically, the study involved comparing the average test scores of the disrupted examinees with those of the non-disrupted examinees after weighting the non-disrupted examinees to resemble the disrupted examinees along baseline characteristics. The results show that disruptions had a small negative impact on test scores on average. However, there was little difference in performance between the disrupted and non-disrupted examinees after removing records of the disrupted examinees who were unable to complete the test.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679922/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40494729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying Negative Binomial Distribution in Diagnostic Classification Models for Analyzing Count Data.","authors":"Ren Liu, Ihnwhi Heo, Haiyan Liu, Dexin Shi, Zhehan Jiang","doi":"10.1177/01466216221124604","DOIUrl":"10.1177/01466216221124604","url":null,"abstract":"<p><p>Diagnostic classification models (DCMs) have been used to classify examinees into groups based on their possession status of a set of latent traits. In addition to traditional item-based scoring approaches, examinees may be scored based on their completion of a series of small and similar tasks. Those scores are usually considered as count variables. To model count scores, this study proposes a new class of DCMs that uses the negative binomial distribution at its core. We explained the proposed model framework and demonstrated its use through an operational example. Simulation studies were conducted to evaluate the performance of the proposed model and compare it with the Poisson-based DCM.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/94/10.1177_01466216221124604.PMC9679925.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40494728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feri Wijayanto, Ioan Gabriel Bucur, Perry Groot, Tom Heskes
{"title":"autoRasch: An R Package to Do Semi-Automated Rasch Analysis.","authors":"Feri Wijayanto, Ioan Gabriel Bucur, Perry Groot, Tom Heskes","doi":"10.1177/01466216221125178","DOIUrl":"https://doi.org/10.1177/01466216221125178","url":null,"abstract":"<p><p>The R package autoRasch has been developed to perform a Rasch analysis in a (semi-)automated way. The automated part of the analysis is achieved by optimizing the so-called <i>in-plus-out-of-questionnaire log-likelihood</i> (IPOQ-LL) or IPOQ-LL-DIF when differential item functioning (DIF) is included. These criteria measure the quality of fit on a pre-collected survey, depending on which items are included in the final instrument. To compute these criteria, autoRasch fits the generalized partial credit model (GPCM) or the generalized partial credit model with differential item functioning (GPCM-DIF) using penalized joint maximum likelihood estimation (PJMLE). The package further allows the user to reevaluate the output of the automated method and use it as a basis for performing a manual Rasch analysis and provides standard statistics of Rasch analyses (e.g., outfit, infit, person separation reliability, and residual correlation) to support the model reevaluation.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679921/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40494732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Outlier Detection Using t-test in Rasch IRT Equating under NEAT Design.","authors":"Chunyan Liu, Daniel Jurich","doi":"10.1177/01466216221124045","DOIUrl":"10.1177/01466216221124045","url":null,"abstract":"<p><p>In equating practice, the existence of outliers in the anchor items may deteriorate the equating accuracy and threaten the validity of test scores. Therefore, stability of the anchor item performance should be evaluated before conducting equating. This study used simulation to investigate the performance of the <i>t</i>-test method in detecting outliers and compared its performance with other outlier detection methods, including the logit difference method with 0.5 and 0.3 as the cutoff values and the robust <i>z</i> statistic with 2.7 as the cutoff value. The investigated factors included sample size, proportion of outliers, item difficulty drift direction, and group difference. Across all simulated conditions, the <i>t</i>-test method outperformed the other methods in terms of sensitivity of flagging true outliers, bias of the estimated translation constant, and the root mean square error of examinee ability estimates.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679927/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40494730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kuan-Yu Jin, Chia-Ling Hsu, Ming Ming Chiu, Po-Hsi Chen
{"title":"Modeling Rapid Guessing Behaviors in Computer-Based Testlet Items.","authors":"Kuan-Yu Jin, Chia-Ling Hsu, Ming Ming Chiu, Po-Hsi Chen","doi":"10.1177/01466216221125177","DOIUrl":"10.1177/01466216221125177","url":null,"abstract":"<p><p>In traditional test models, test items are independent, and test-takers slowly and thoughtfully respond to each test item. However, some test items have a common stimulus (dependent test items in a testlet), and sometimes test-takers lack motivation, knowledge, or time (speededness), so they perform rapid guessing (RG). Ignoring the dependence in responses to testlet items can negatively bias standard errors of measurement, and ignoring RG by fitting a simpler item response theory (IRT) model can bias the results. Because computer-based testing captures response times on testlet responses, we propose a mixture testlet IRT model with item responses and response time to model RG behaviors in computer-based testlet items. Two simulation studies with Markov chain Monte Carlo estimation using the JAGS program showed (a) good recovery of the item and person parameters in this new model and (b) the harmful consequences of ignoring RG (biased parameter estimates: overestimated item difficulties, underestimated time intensities, underestimated respondent latent speed parameters, and overestimated precision of respondent latent estimates). The application of IRT models with and without RG to data from a computer-based language test showed parameter differences resembling those in the simulations.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679923/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40494726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Metropolis-Hastings Robbins-Monro Algorithm for High-Dimensional Diagnostic Classification Models.","authors":"Chen-Wei Liu","doi":"10.1177/01466216221123981","DOIUrl":"10.1177/01466216221123981","url":null,"abstract":"<p><p>The expectation-maximization (EM) algorithm is a commonly used technique for the parameter estimation of the diagnostic classification models (DCMs) with a prespecified Q-matrix; however, it requires <i>O</i>(2 <sup><i>K</i></sup> ) calculations in its expectation-step, which significantly slows down the computation when the number of attributes, <i>K</i>, is large. This study proposes an efficient Metropolis-Hastings Robbins-Monro (eMHRM) algorithm, needing only <i>O</i>(<i>K</i> + 1) calculations in the Monte Carlo expectation step. Furthermore, the item parameters and structural parameters are approximated via the Robbins-Monro algorithm, which does not require time-consuming nonlinear optimization procedures. A series of simulation studies were conducted to compare the eMHRM with the EM and a Metropolis-Hastings (MH) algorithm regarding the parameter recovery and execution time. The outcomes presented in this article reveal that the eMHRM is much more computationally efficient than the EM and MH, and it tends to produce better estimates than the EM when <i>K</i> is large, suggesting that the eMHRM is a promising parameter estimation method for high-dimensional DCMs.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574082/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40656644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item Selection With Collaborative Filtering in On-The-Fly Multistage Adaptive Testing.","authors":"Jiaying Xiao, Okan Bulut","doi":"10.1177/01466216221124089","DOIUrl":"https://doi.org/10.1177/01466216221124089","url":null,"abstract":"<p><p>An important design feature in the implementation of both computerized adaptive testing and multistage adaptive testing is the use of an appropriate method for item selection. The item selection method is expected to select the most optimal items depending on the examinees' ability level while considering other design features (e.g., item exposure and item bank utilization). This study introduced collaborative filtering (CF) as a new method for item selection in the <i>on-the-fly assembled multistage adaptive testing</i> framework. The user-based CF (UBCF) and item-based CF (IBCF) methods were compared to the maximum Fisher information method based on the accuracy of ability estimation, item exposure rates, and item bank utilization under different test conditions (e.g., item bank size, test length, and the sparseness of training data). The simulation results indicated that the UBCF method outperformed the traditional item selection methods regarding measurement accuracy. Also, the IBCF method showed the most superior performance in terms of item bank utilization. Limitations of the current study and the directions for future research are discussed.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/09/ba/10.1177_01466216221124089.PMC9574085.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40656645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attenuation-Corrected Estimators of Reliability.","authors":"Jari Metsämuuronen","doi":"10.1177/01466216221108131","DOIUrl":"https://doi.org/10.1177/01466216221108131","url":null,"abstract":"<p><p>The estimates of reliability are usually attenuated and deflated because the item-score correlation ( <math> <mrow><msub><mi>ρ</mi> <mrow><mi>g</mi> <mi>X</mi></mrow> </msub> </mrow> </math> , <i>Rit</i>) embedded in the most widely used estimators is affected by several sources of mechanical error in the estimation. Empirical examples show that, in some types of datasets, the estimates by traditional alpha may be deflated by 0.40-0.60 units of reliability and those by maximal reliability by 0.40 units of reliability. This article proposes a new kind of estimator of correlation: attenuation-corrected correlation (<i>R</i> <sub><i>AC</i></sub> ): the proportion of observed correlation with the maximal possible correlation reachable by the given item and score. By replacing <math> <mrow><msub><mi>ρ</mi> <mrow><mi>g</mi> <mi>X</mi></mrow> </msub> </mrow> </math> with <i>R</i> <sub><i>AC</i></sub> in known formulas of estimators of reliability, we get attenuation-corrected alpha, theta, omega, and maximal reliability which all belong to a family of so-called deflation-corrected estimators of reliability.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/66/7b/10.1177_01466216221108131.PMC9574086.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40573822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Empirical Identification Issue of the Bifactor Item Response Theory Model.","authors":"Wenya Chen, Ken A Fujimoto","doi":"10.1177/01466216221108133","DOIUrl":"10.1177/01466216221108133","url":null,"abstract":"<p><p>Using the bifactor item response theory model to analyze data arising from educational and psychological studies has gained popularity over the years. Unfortunately, using this model in practice comes with challenges. One such challenge is an empirical identification issue that is seldom discussed in the literature, and its impact on the estimates of the bifactor model's parameters has not been demonstrated. This issue occurs when an item's discriminations on the general and specific dimensions are approximately equal (i.e., the within-item discriminations are similar in strength), leading to difficulties in obtaining unique estimates for those discriminations. We conducted three simulation studies to demonstrate that within-item discriminations being similar in strength creates problems in estimation stability. The results suggest that a large sample could alleviate but not resolve the problems, at least when considering sample sizes up to 4,000. When the discriminations within items were made clearly different, the estimates of these discriminations were more consistent across the data replicates than that observed when the discriminations within the items were similar. The results also show that the similarity of an item's discriminatory magnitudes on different dimensions has direct implications on the sample size needed in order to consistently obtain accurate parameter estimates. Although our goal was to provide evidence of the empirical identification issue, the study further reveals that the extent of similarity of within-item discriminations, the magnitude of discriminations, and how well the items are targeted to the respondents also play factors in the estimation of the bifactor model's parameters.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574084/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40656647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible Item Response Models for Count Data: The Count Thresholds Model.","authors":"Gerhard Tutz","doi":"10.1177/01466216221108124","DOIUrl":"10.1177/01466216221108124","url":null,"abstract":"<p><p>A new item response theory model for count data is introduced. In contrast to models in common use, it does not assume a fixed distribution for the responses as, for example, the Poisson count model and extensions do. The distribution of responses is determined by difficulty functions which reflect the characteristics of items in a flexible way. Sparse parameterizations are obtained by choosing fixed parametric difficulty functions, more general versions use an approximation by basis functions. The model can be seen as constructed from binary response models as the Rasch model or the normal-ogive model to which it reduces if responses are dichotomized. It is demonstrated that the model competes well with advanced count data models. Simulations demonstrate that parameters and response distributions are recovered well. An application shows the flexibility of the model to account for strongly varying distributions of responses.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574081/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40573824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}