{"title":"Exploratory Graph Analysis for Factor Retention: Simulation Results for Continuous and Binary Data.","authors":"Tim Cosemans, Yves Rosseel, Sarah Gelper","doi":"10.1177/00131644211059089","DOIUrl":"https://doi.org/10.1177/00131644211059089","url":null,"abstract":"<p><p>Exploratory graph analysis (EGA) is a commonly applied technique intended to help social scientists discover latent variables. Yet, the results can be influenced by the methodological decisions the researcher makes along the way. In this article, we focus on the choice regarding the number of factors to retain: We compare the performance of the recently developed EGA with various traditional factor retention criteria. We use both continuous and binary data, as evidence regarding the accuracy of such criteria in the latter case is scarce. Simulation results, based on scenarios resulting from varying sample size, communalities from major factors, interfactor correlations, skewness, and correlation measure, show that EGA outperforms the traditional factor retention criteria considered in most cases in terms of bias and accuracy. In addition, we show that factor retention decisions for binary data are preferably made using Pearson, instead of tetrachoric, correlations, which is contradictory to popular belief.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"880-910"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386885/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effect of Latent and Error Non-Normality on Measures of Fit in Structural Equation Modeling.","authors":"Lisa J Jobst, Max Auerswald, Morten Moshagen","doi":"10.1177/00131644211046201","DOIUrl":"10.1177/00131644211046201","url":null,"abstract":"<p><p>Prior studies investigating the effects of non-normality in structural equation modeling typically induced non-normality in the indicator variables. This procedure neglects the factor analytic structure of the data, which is defined as the sum of latent variables and errors, so it is unclear whether previous results hold if the source of non-normality is considered. We conducted a Monte Carlo simulation manipulating the underlying multivariate distribution to assess the effect of the source of non-normality (latent, error, and marginal conditions with either multivariate normal or non-normal marginal distributions) on different measures of fit (empirical rejection rates for the likelihood-ratio model test statistic, the root mean square error of approximation, the standardized root mean square residual, and the comparative fit index). We considered different estimation methods (maximum likelihood, generalized least squares, and (un)modified asymptotically distribution-free), sample sizes, and the extent of non-normality in correctly specified and misspecified models to investigate their performance. The results show that all measures of fit were affected by the source of non-normality but with varying patterns for the analyzed estimation methods.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"911-937"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386883/pdf/10.1177_00131644211046201.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40628155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Change Point Analysis of Response Time Data to Detect Test Speededness.","authors":"Ying Cheng, Can Shao","doi":"10.1177/00131644211046392","DOIUrl":"10.1177/00131644211046392","url":null,"abstract":"<p><p>Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to the conventional item response data that are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an unobstrusive manner. It therefore has great potential to improve many measurement activities. In this paper, we propose a change point analysis (CPA) procedure to detect test speededness using response time data. Specifically, two test statistics based on CPA, the likelihood ratio test and Wald test, are proposed to detect test speededness. A simulation study has been conducted to evaluate the performance of the proposed CPA procedure, as well as the use of asymptotic and empirical critical values. Results indicate that the proposed procedure leads to high power in detecting test speededness, while keeping the false positive rate under control, even when simplistic and liberal critical values are used. Accuracy of the estimation of the actual change point, however, is highly dependent on the true change point. A real data example is also provided to illustrate the utility of the proposed procedure and its contrast to the response-only procedure. Implications of the findings are discussed at the end.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"1031-1062"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386879/pdf/10.1177_00131644211046392.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Robustness of the Graded Response and 2-Parameter Logistic Models to Violations of Construct Normality.","authors":"Patrick D Manapat, Michael C Edwards","doi":"10.1177/00131644211063453","DOIUrl":"10.1177/00131644211063453","url":null,"abstract":"<p><p>When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait (θ) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal θ. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed distribution where the construct is low for most people, medium for some, and high for few. Failure to account for nonnormality may compromise the validity of inferences and conclusions. Although corrections have been developed to account for nonnormality, these methods can be computationally intensive and have not yet been widely adopted. Previous research has recommended implementing nonnormality corrections when θ is not \"approximately normal.\" This research focused on examining how far θ can deviate from normal before the normality assumption becomes untenable. Specifically, our goal was to identify the type(s) and degree(s) of nonnormality that result in unacceptable parameter recovery for the graded response model (GRM) and 2-parameter logistic model (2PLM).</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"967-988"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386882/pdf/10.1177_00131644211063453.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Problematic Item Characteristics With Small Samples Using Mokken Scale Analysis.","authors":"Stefanie A Wind","doi":"10.1177/00131644211045347","DOIUrl":"https://doi.org/10.1177/00131644211045347","url":null,"abstract":"<p><p>Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level measurement problems, such as violations of monotonicity or invariant item ordering (IIO). Moreover, these studies have focused on problems that occur for a complete sample of examinees. The current study uses a simulation study to consider the sensitivity of MSA item analysis procedures to problematic item characteristics that occur within limited ranges of the latent variable. Results generally support the use of MSA with small samples (<i>N</i> around 100 examinees) as long as multiple indicators of item quality are considered.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"747-756"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228692/pdf/10.1177_00131644211045347.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Differential Rater Functioning in Severity and Centrality: The Dual DRF Facets Model.","authors":"Kuan-Yu Jin, Thomas Eckes","doi":"10.1177/00131644211043207","DOIUrl":"https://doi.org/10.1177/00131644211043207","url":null,"abstract":"<p><p>Performance assessments heavily rely on human ratings. These ratings are typically subject to various forms of error and bias, threatening the assessment outcomes' validity and fairness. Differential rater functioning (DRF) is a special kind of threat to fairness manifesting itself in unwanted interactions between raters and performance- or construct-irrelevant factors (e.g., examinee gender, rater experience, or time of rating). Most DRF studies have focused on whether raters show differential severity toward known groups of examinees. This study expands the DRF framework and investigates the more complex case of dual DRF effects, where DRF is simultaneously present in rater severity and centrality. Adopting a facets modeling approach, we propose the dual DRF model (DDRFM) for detecting and measuring these effects. In two simulation studies, we found that dual DRF effects (a) negatively affected measurement quality and (b) can reliably be detected and compensated under the DDRFM. Using sample data from a large-scale writing assessment (<i>N</i> = 1,323), we demonstrate the practical measurement consequences of the dual DRF effects. Findings have implications for researchers and practitioners assessing the psychometric quality of ratings.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"757-781"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228693/pdf/10.1177_00131644211043207.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10271624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Response Vector for Mastery Method of Standard Setting.","authors":"Dimiter M Dimitrov","doi":"10.1177/00131644211032388","DOIUrl":"10.1177/00131644211032388","url":null,"abstract":"<p><p>Proposed is a new method of standard setting referred to as response vector for mastery (RVM) method. Under the RVM method, the task of panelists that participate in the standard setting process does not involve conceptualization of a borderline examinee and probability judgments as it is the case with the Angoff and bookmark methods. Also, the RVM-based computation of a cut-score is not based on a single item (e.g., marked in an ordered item booklet) but, instead, on a response vector (1/0 scores) on items and their parameters calibrated in item response theory or under the recently developed <i>D</i>-scoring method. Illustrations with hypothetical and real-data scenarios of standard setting are provided and methodological aspects of the RVM method are discussed.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"719-746"},"PeriodicalIF":2.1,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228699/pdf/10.1177_00131644211032388.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10271623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DIF Detection With Zero-Inflation Under the Factor Mixture Modeling Framework.","authors":"Sooyong Lee, Suhwa Han, Seung W Choi","doi":"10.1177/00131644211028995","DOIUrl":"10.1177/00131644211028995","url":null,"abstract":"<p><p>Response data containing an excessive number of zeros are referred to as zero-inflated data. When differential item functioning (DIF) detection is of interest, zero-inflation can attenuate DIF effects in the total sample and lead to underdetection of DIF items. The current study presents a DIF detection procedure for response data with excess zeros due to the existence of unobserved heterogeneous subgroups. The suggested procedure utilizes the factor mixture modeling (FMM) with MIMIC (multiple-indicator multiple-cause) to address the compromised DIF detection power via the estimation of latent classes. A Monte Carlo simulation was conducted to evaluate the suggested procedure in comparison to the well-known likelihood ratio (LR) DIF test. Our simulation study results indicated the superiority of FMM over the LR DIF test in terms of detection power and illustrated the importance of accounting for latent heterogeneity in zero-inflated data. The empirical data analysis results further supported the use of FMM by flagging additional DIF items over and above the LR test.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"678-704"},"PeriodicalIF":2.1,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228697/pdf/10.1177_00131644211028995.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid Threshold-Based Sequential Procedures for Detecting Compromised Items in a Computerized Adaptive Testing Licensure Exam.","authors":"Chansoon Lee, Hong Qian","doi":"10.1177/00131644211023868","DOIUrl":"https://doi.org/10.1177/00131644211023868","url":null,"abstract":"<p><p>Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential procedure while controlling the Type I error rate. The hybrid threshold approach uses a local threshold for each item in an early stage of the CAT administration, and then it uses the global threshold in the decision-making stage. Applying various simulation factors, a series of simulation studies examined which factors contribute significantly to the power rate and lag time of the procedure. In addition to the simulation study, a case study investigated whether the procedures are applicable to the real item pool administered in CAT and can identify potentially compromised items in the pool. This research found that the increment of probability of a correct answer (<i>p</i>-increment) was the simulation factor most important to the sequential procedures' ability to detect compromised items. This study also found that the local threshold approach improved power rates and shortened lag times when the <i>p</i>-increment was small. The findings of this study could help practitioners implement the sequential procedures using the hybrid threshold approach in real-time CAT administration.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"782-810"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/00131644211023868","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10272075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Monte Carlo Study of Confidence Interval Methods for Generalizability Coefficient.","authors":"Zhehan Jiang, Mark Raymond, Christine DiStefano, Dexin Shi, Ren Liu, Junhua Sun","doi":"10.1177/00131644211033899","DOIUrl":"https://doi.org/10.1177/00131644211033899","url":null,"abstract":"<p><p>Computing confidence intervals around generalizability coefficients has long been a challenging task in generalizability theory. This is a serious practical problem because generalizability coefficients are often computed from designs where some facets have small sample sizes, and researchers have little guide regarding the trustworthiness of the coefficients. As generalizability theory can be framed to a linear mixed-effect model (LMM), bootstrap and simulation techniques from LMM paradigm can be used to construct the confidence intervals. The purpose of this research is to examine four different LMM-based methods for computing the confidence intervals that have been proposed and to determine their accuracy under six simulated conditions based on the type of test scores (normal, dichotomous, and polytomous data) and data measurement design (<i>p</i>×<i>i</i>×<i>r</i> and <i>p</i>× [<i>i:r</i>]). A bootstrap technique called \"parametric methods with spherical random effects\" consistently produced more accurate confidence intervals than the three other LMM-based methods. Furthermore, the selected technique was compared with model-based approach to investigate the performance at the levels of variance components via the second simulation study, where the numbers of examines, raters, and items were varied. We conclude with the recommendation generalizability coefficients, the confidence interval should accompany the point estimate.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"705-718"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228698/pdf/10.1177_00131644211033899.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}