{"title":"The Effect of Latent and Error Non-Normality on Measures of Fit in Structural Equation Modeling.","authors":"Lisa J Jobst, Max Auerswald, Morten Moshagen","doi":"10.1177/00131644211046201","DOIUrl":"10.1177/00131644211046201","url":null,"abstract":"<p><p>Prior studies investigating the effects of non-normality in structural equation modeling typically induced non-normality in the indicator variables. This procedure neglects the factor analytic structure of the data, which is defined as the sum of latent variables and errors, so it is unclear whether previous results hold if the source of non-normality is considered. We conducted a Monte Carlo simulation manipulating the underlying multivariate distribution to assess the effect of the source of non-normality (latent, error, and marginal conditions with either multivariate normal or non-normal marginal distributions) on different measures of fit (empirical rejection rates for the likelihood-ratio model test statistic, the root mean square error of approximation, the standardized root mean square residual, and the comparative fit index). We considered different estimation methods (maximum likelihood, generalized least squares, and (un)modified asymptotically distribution-free), sample sizes, and the extent of non-normality in correctly specified and misspecified models to investigate their performance. The results show that all measures of fit were affected by the source of non-normality but with varying patterns for the analyzed estimation methods.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"911-937"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386883/pdf/10.1177_00131644211046201.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40628155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Change Point Analysis of Response Time Data to Detect Test Speededness.","authors":"Ying Cheng, Can Shao","doi":"10.1177/00131644211046392","DOIUrl":"10.1177/00131644211046392","url":null,"abstract":"<p><p>Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to the conventional item response data that are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an unobstrusive manner. It therefore has great potential to improve many measurement activities. In this paper, we propose a change point analysis (CPA) procedure to detect test speededness using response time data. Specifically, two test statistics based on CPA, the likelihood ratio test and Wald test, are proposed to detect test speededness. A simulation study has been conducted to evaluate the performance of the proposed CPA procedure, as well as the use of asymptotic and empirical critical values. Results indicate that the proposed procedure leads to high power in detecting test speededness, while keeping the false positive rate under control, even when simplistic and liberal critical values are used. Accuracy of the estimation of the actual change point, however, is highly dependent on the true change point. A real data example is also provided to illustrate the utility of the proposed procedure and its contrast to the response-only procedure. Implications of the findings are discussed at the end.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"1031-1062"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386879/pdf/10.1177_00131644211046392.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design Effect in Multilevel Settings: A Commentary on a Latent Variable Modeling Procedure for Its Evaluation.","authors":"Tenko Raykov, Christine DiStefano","doi":"10.1177/00131644211019447","DOIUrl":"10.1177/00131644211019447","url":null,"abstract":"<p><p>A latent variable modeling-based procedure is discussed that permits to readily point and interval estimate the design effect index in multilevel settings using widely circulated software. The method provides useful information about the relationship of important parameter standard errors when accounting for clustering effects relative to conducting single-level analyses. The approach can also be employed as an addendum to point and interval estimation of the intraclass correlation coefficient in empirical research. The discussed procedure makes it easily possible to evaluate the design effect in two-level studies by utilizing the popular latent variable modeling methodology and is illustrated with an example.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"1020-1030"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/00131644211019447","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Robustness of the Graded Response and 2-Parameter Logistic Models to Violations of Construct Normality.","authors":"Patrick D Manapat, Michael C Edwards","doi":"10.1177/00131644211063453","DOIUrl":"10.1177/00131644211063453","url":null,"abstract":"<p><p>When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait (θ) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal θ. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed distribution where the construct is low for most people, medium for some, and high for few. Failure to account for nonnormality may compromise the validity of inferences and conclusions. Although corrections have been developed to account for nonnormality, these methods can be computationally intensive and have not yet been widely adopted. Previous research has recommended implementing nonnormality corrections when θ is not \"approximately normal.\" This research focused on examining how far θ can deviate from normal before the normality assumption becomes untenable. Specifically, our goal was to identify the type(s) and degree(s) of nonnormality that result in unacceptable parameter recovery for the graded response model (GRM) and 2-parameter logistic model (2PLM).</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 5","pages":"967-988"},"PeriodicalIF":2.7,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9386882/pdf/10.1177_00131644211063453.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40626322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Problematic Item Characteristics With Small Samples Using Mokken Scale Analysis.","authors":"Stefanie A Wind","doi":"10.1177/00131644211045347","DOIUrl":"https://doi.org/10.1177/00131644211045347","url":null,"abstract":"<p><p>Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level measurement problems, such as violations of monotonicity or invariant item ordering (IIO). Moreover, these studies have focused on problems that occur for a complete sample of examinees. The current study uses a simulation study to consider the sensitivity of MSA item analysis procedures to problematic item characteristics that occur within limited ranges of the latent variable. Results generally support the use of MSA with small samples (<i>N</i> around 100 examinees) as long as multiple indicators of item quality are considered.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"747-756"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228692/pdf/10.1177_00131644211045347.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Differential Rater Functioning in Severity and Centrality: The Dual DRF Facets Model.","authors":"Kuan-Yu Jin, Thomas Eckes","doi":"10.1177/00131644211043207","DOIUrl":"https://doi.org/10.1177/00131644211043207","url":null,"abstract":"<p><p>Performance assessments heavily rely on human ratings. These ratings are typically subject to various forms of error and bias, threatening the assessment outcomes' validity and fairness. Differential rater functioning (DRF) is a special kind of threat to fairness manifesting itself in unwanted interactions between raters and performance- or construct-irrelevant factors (e.g., examinee gender, rater experience, or time of rating). Most DRF studies have focused on whether raters show differential severity toward known groups of examinees. This study expands the DRF framework and investigates the more complex case of dual DRF effects, where DRF is simultaneously present in rater severity and centrality. Adopting a facets modeling approach, we propose the dual DRF model (DDRFM) for detecting and measuring these effects. In two simulation studies, we found that dual DRF effects (a) negatively affected measurement quality and (b) can reliably be detected and compensated under the DDRFM. Using sample data from a large-scale writing assessment (<i>N</i> = 1,323), we demonstrate the practical measurement consequences of the dual DRF effects. Findings have implications for researchers and practitioners assessing the psychometric quality of ratings.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"757-781"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228693/pdf/10.1177_00131644211043207.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10271624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Response Vector for Mastery Method of Standard Setting.","authors":"Dimiter M Dimitrov","doi":"10.1177/00131644211032388","DOIUrl":"10.1177/00131644211032388","url":null,"abstract":"<p><p>Proposed is a new method of standard setting referred to as response vector for mastery (RVM) method. Under the RVM method, the task of panelists that participate in the standard setting process does not involve conceptualization of a borderline examinee and probability judgments as it is the case with the Angoff and bookmark methods. Also, the RVM-based computation of a cut-score is not based on a single item (e.g., marked in an ordered item booklet) but, instead, on a response vector (1/0 scores) on items and their parameters calibrated in item response theory or under the recently developed <i>D</i>-scoring method. Illustrations with hypothetical and real-data scenarios of standard setting are provided and methodological aspects of the RVM method are discussed.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"719-746"},"PeriodicalIF":2.1,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228699/pdf/10.1177_00131644211032388.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10271623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DIF Detection With Zero-Inflation Under the Factor Mixture Modeling Framework.","authors":"Sooyong Lee, Suhwa Han, Seung W Choi","doi":"10.1177/00131644211028995","DOIUrl":"10.1177/00131644211028995","url":null,"abstract":"<p><p>Response data containing an excessive number of zeros are referred to as zero-inflated data. When differential item functioning (DIF) detection is of interest, zero-inflation can attenuate DIF effects in the total sample and lead to underdetection of DIF items. The current study presents a DIF detection procedure for response data with excess zeros due to the existence of unobserved heterogeneous subgroups. The suggested procedure utilizes the factor mixture modeling (FMM) with MIMIC (multiple-indicator multiple-cause) to address the compromised DIF detection power via the estimation of latent classes. A Monte Carlo simulation was conducted to evaluate the suggested procedure in comparison to the well-known likelihood ratio (LR) DIF test. Our simulation study results indicated the superiority of FMM over the LR DIF test in terms of detection power and illustrated the importance of accounting for latent heterogeneity in zero-inflated data. The empirical data analysis results further supported the use of FMM by flagging additional DIF items over and above the LR test.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"678-704"},"PeriodicalIF":2.1,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228697/pdf/10.1177_00131644211028995.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extended Multivariate Generalizability Theory With Complex Design Structures.","authors":"Robert L Brennan, Stella Y Kim, Won-Chan Lee","doi":"10.1177/00131644211049746","DOIUrl":"https://doi.org/10.1177/00131644211049746","url":null,"abstract":"<p><p>This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and free-response items, with the latter involving variability attributable to both items and raters. In this case, two distinct designs are needed to fully characterize the design and capture potential sources of error associated with each item format. Another example involves tests containing both testlets and one or more stand-alone sets of items. Testlet effects need to be taken into account for the testlet-based items, but not the stand-alone sets of items. This article presents an extension of MGT that faithfully models such complex test designs, along with two real-data examples. Among other things, these examples illustrate that estimates of error variance, error-tolerance ratios, and reliability-like coefficients can be biased if there is a mismatch between the user-specified universe of generalization and the complex nature of the test.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"617-642"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9228696/pdf/10.1177_00131644211049746.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid Threshold-Based Sequential Procedures for Detecting Compromised Items in a Computerized Adaptive Testing Licensure Exam.","authors":"Chansoon Lee, Hong Qian","doi":"10.1177/00131644211023868","DOIUrl":"https://doi.org/10.1177/00131644211023868","url":null,"abstract":"<p><p>Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential procedure while controlling the Type I error rate. The hybrid threshold approach uses a local threshold for each item in an early stage of the CAT administration, and then it uses the global threshold in the decision-making stage. Applying various simulation factors, a series of simulation studies examined which factors contribute significantly to the power rate and lag time of the procedure. In addition to the simulation study, a case study investigated whether the procedures are applicable to the real item pool administered in CAT and can identify potentially compromised items in the pool. This research found that the increment of probability of a correct answer (<i>p</i>-increment) was the simulation factor most important to the sequential procedures' ability to detect compromised items. This study also found that the local threshold approach improved power rates and shortened lag times when the <i>p</i>-increment was small. The findings of this study could help practitioners implement the sequential procedures using the hybrid threshold approach in real-time CAT administration.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"82 4","pages":"782-810"},"PeriodicalIF":2.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/00131644211023868","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10272075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}