Psychometrika | Pub Date: 2024-12-01 | Epub Date: 2024-10-30 | DOI: 10.1007/s11336-024-10004-7
Klaas Sijtsma, Jules L Ellis, Denny Borsboom
{"title":"Rejoinder to McNeish and Mislevy: What Does Psychological Measurement Require?","authors":"Klaas Sijtsma, Jules L Ellis, Denny Borsboom","doi":"10.1007/s11336-024-10004-7","DOIUrl":"10.1007/s11336-024-10004-7","url":null,"abstract":"<p><p>In this rejoinder to McNeish (2024) and Mislevy (2024), who both responded to our focus article on the merits of the simple sum score (Sijtsma et al., 2024), we address several issues. Psychometrics education and in particular psychometricians' outreach may help researchers to use IRT models as a precursor for the responsible use of the latent variable score and the sum score. Different methods used for test and questionnaire construction often do not produce highly different results, and when they do, this may be due to an unarticulated attribute theory generating noisy data. The sum score and transformations thereof, such as normalized test scores and percentiles, may help test practitioners and their clients to better communicate results. Latent variables prove important in more advanced applications such as equating and adaptive testing where they serve as technical tools rather than communication devices. Decisions based on test results are often binary or use a rather coarse ordering of scale levels, hence, do not require a high level of granularity (but nevertheless need to be precise). A gap exists between psychology and psychometrics which is growing deeper and wider, and that needs to be bridged. 
Psychology and psychometrics must work together to attain this goal.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1175-1185"},"PeriodicalIF":2.9,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142548948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-03-12 | DOI: 10.1007/s11336-024-09956-7
Jules L Ellis, Klaas Sijtsma
{"title":"Proof of Reliability Convergence to 1 at Rate of Spearman-Brown Formula for Random Test Forms and Irrespective of Item Pool Dimensionality.","authors":"Jules L Ellis, Klaas Sijtsma","doi":"10.1007/s11336-024-09956-7","DOIUrl":"10.1007/s11336-024-09956-7","url":null,"abstract":"<p><p>It is shown that the psychometric test reliability, based on any true-score model with randomly sampled items and uncorrelated errors, converges to 1 as the test length goes to infinity, with probability 1, assuming some general regularity conditions. The asymptotic rate of convergence is given by the Spearman-Brown formula, and for this it is not needed that the items are parallel, or latent unidimensional, or even finite dimensional. Simulations with the 2-parameter logistic item response theory model reveal that the reliability of short multidimensional tests can be positively biased, meaning that applying the Spearman-Brown formula in these cases would lead to overprediction of the reliability that results from lengthening a test. However, test constructors of short tests generally aim for short tests that measure just one attribute, so that the bias problem may have little practical relevance. 
For short unidimensional tests under the 2-parameter logistic model reliability is almost unbiased, meaning that application of the Spearman-Brown formula in these cases of greater practical utility leads to predictions that are approximately unbiased.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"774-795"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458731/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140112220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
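The Spearman-Brown prediction mentioned in the abstract above can be illustrated with a short sketch. It uses the standard textbook formula, rho_k = k * rho / (1 + (k - 1) * rho); the function name `spearman_brown` and the starting reliability of 0.5 are illustrative assumptions, not values taken from the article.

```python
def spearman_brown(rho: float, k: float) -> float:
    """Predicted reliability when a test with reliability rho is lengthened by factor k.

    Standard Spearman-Brown formula; the function name is hypothetical.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * rho / (1.0 + (k - 1.0) * rho)


# Predicted reliability approaches 1 as the lengthening factor grows.
for k in (1, 2, 4, 16, 256):
    print(k, round(spearman_brown(0.5, k), 4))
```

The loop only shows the limiting behavior of the formula itself; it does not reproduce the article's proof or its simulations with randomly sampled items.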
{"title":"Diagnostic Classification Models for Testlets: Methods and Theory.","authors":"Xin Xu, Guanhua Fang, Jinxin Guo, Zhiliang Ying, Susu Zhang","doi":"10.1007/s11336-024-09962-9","DOIUrl":"10.1007/s11336-024-09962-9","url":null,"abstract":"<p><p>Diagnostic classification models (DCMs) have seen wide applications in educational and psychological measurement, especially in formative assessment. DCMs in the presence of testlets have been studied in recent literature. A key ingredient in the statistical modeling and analysis of testlet-based DCMs is the superposition of two latent structures, the attribute profile and the testlet effect. This paper extends the standard testlet DINA (T-DINA) model to accommodate the potential correlation between the two latent structures. Model identifiability is studied and a set of sufficient conditions are proposed. As a byproduct, the identifiability of the standard T-DINA is also established. The proposed model is applied to a dataset from the 2015 Programme for International Student Assessment. Comparisons are made with DINA and T-DINA, showing that there is substantial improvement in terms of the goodness of fit. Simulations are conducted to assess the performance of the new method under various settings.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"851-876"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140289617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-06-03 | DOI: 10.1007/s11336-024-09977-2
Benjamin W Domingue, Klint Kanopka, Radhika Kapoor, Steffi Pohl, R Philip Chalmers, Charles Rahal, Mijke Rhemtulla
{"title":"The InterModel Vigorish as a Lens for Understanding (and Quantifying) the Value of Item Response Models for Dichotomously Coded Items.","authors":"Benjamin W Domingue, Klint Kanopka, Radhika Kapoor, Steffi Pohl, R Philip Chalmers, Charles Rahal, Mijke Rhemtulla","doi":"10.1007/s11336-024-09977-2","DOIUrl":"10.1007/s11336-024-09977-2","url":null,"abstract":"<p><p>The deployment of statistical models, such as those used in item response theory, necessitates the use of indices that are informative about the degree to which a given model is appropriate for a specific data context. We introduce the InterModel Vigorish (IMV) as an index that can be used to quantify accuracy for models of dichotomous item responses based on the improvement across two sets of predictions (i.e., predictions from two item response models or predictions from a single such model relative to prediction based on the mean). This index has a range of desirable features: It can be used for the comparison of non-nested models and its values are highly portable and generalizable. We use this fact to compare predictive performance across a variety of simulated data contexts and also demonstrate qualitative differences in behavior between the IMV and other common indices (e.g., the AIC and RMSEA). We also illustrate the utility of the IMV in empirical applications with data from 89 dichotomous item response datasets. These empirical applications help illustrate how the IMV can be used in practice and substantiate our claims regarding various aspects of model performance. 
These findings indicate that the IMV may be a useful indicator in psychometrics, especially as it allows for easy comparison of predictions across a variety of contexts.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1034-1054"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
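The abstract above describes the IMV as an improvement across two sets of predictions but gives no formula. The sketch below assumes one formulation we believe matches the IMV literature: each prediction set is summarized by its mean log-likelihood, which is mapped to an "equivalent weighted coin" w in [0.5, 1] solving w log w + (1 - w) log(1 - w) = mean log-likelihood, and the index is (w1 - w0) / w0. All function names and the toy data are hypothetical.

```python
import math


def _coin_loglik(w: float) -> float:
    """Expected log-likelihood of a coin with success probability w (0 * log 0 := 0)."""
    return sum(q * math.log(q) for q in (w, 1.0 - w) if q > 0.0)


def equivalent_coin(mean_loglik: float) -> float:
    """Bisection for w in [0.5, 1] whose expected log-likelihood equals mean_loglik."""
    if not -math.log(2.0) <= mean_loglik <= 0.0:
        raise ValueError("mean log-likelihood must lie in [-log 2, 0]")
    lo, hi = 0.5, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if _coin_loglik(mid) < mean_loglik:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mean_loglik(y, p):
    """Mean Bernoulli log-likelihood; assumes every p is strictly between 0 and 1."""
    return sum(math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p)) / len(y)


def imv(y, p_baseline, p_enhanced):
    """Relative gain of the enhanced predictions over the baseline (assumed formulation)."""
    w0 = equivalent_coin(mean_loglik(y, p_baseline))
    w1 = equivalent_coin(mean_loglik(y, p_enhanced))
    return (w1 - w0) / w0


# Toy data: baseline predicts the sample mean, enhanced predictions are sharper.
y = [1, 1, 1, 0]
baseline = [0.75] * 4
enhanced = [0.9, 0.9, 0.9, 0.1]
print(imv(y, baseline, enhanced))
```

Predictions worse than a fair coin fall outside the solvable range and raise an error here; that handling is a choice of this sketch rather than something stated in the abstract.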
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-05-28 | DOI: 10.1007/s11336-024-09975-4
Ke-Hai Yuan, Zhiyong Zhang, Lijuan Wang
{"title":"Signal-to-Noise Ratio in Estimating and Testing the Mediation Effect: Structural Equation Modeling versus Path Analysis with Weighted Composites.","authors":"Ke-Hai Yuan, Zhiyong Zhang, Lijuan Wang","doi":"10.1007/s11336-024-09975-4","DOIUrl":"10.1007/s11336-024-09975-4","url":null,"abstract":"<p><p>Mediation analysis plays an important role in understanding causal processes in social and behavioral sciences. While path analysis with composite scores has been criticized for yielding biased parameter estimates when variables contain measurement errors, recent literature has pointed out that the population values of parameters of latent-variable models are determined by the subjectively assigned scales of the latent variables. Thus, conclusions in existing studies comparing structural equation modeling (SEM) and path analysis with weighted composites (PAWC) on the accuracy and precision of the estimates of the indirect effect in mediation analysis have little validity. Instead of comparing the size of estimates of the indirect effect between SEM and PAWC, this article compares parameter estimates by signal-to-noise ratio (SNR), which does not depend on the metrics of the latent variables once the anchors of the latent variables are determined. Results show that PAWC yields greater SNR than SEM in estimating and testing the indirect effect even when measurement errors exist. In particular, path analysis via factor scores almost always yields greater SNRs than SEM. Mediation analysis with equally weighted composites (EWCs) is also more likely to yield greater SNRs than SEM. Consequently, PAWC is statistically more efficient and more powerful than SEM in conducting mediation analysis in empirical research. 
The article also further studies conditions that cause SEM to have smaller SNRs, and results indicate that the advantage of PAWC becomes more obvious when there is a strong relationship between the predictor and the mediator, whereas the size of the prediction error in the mediator adversely affects the performance of the PAWC methodology. Results of a real-data example also support the conclusions.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"974-1006"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458674/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141162255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-05-28 | DOI: 10.1007/s11336-024-09976-3
Marco Gregori, Martijn G De Jong, Rik Pieters
{"title":"The Crosswise Model for Surveys on Sensitive Topics: A General Framework for Item Selection and Statistical Analysis.","authors":"Marco Gregori, Martijn G De Jong, Rik Pieters","doi":"10.1007/s11336-024-09976-3","DOIUrl":"10.1007/s11336-024-09976-3","url":null,"abstract":"<p><p>When surveys contain direct questions about sensitive topics, participants may not provide their true answers. Indirect question techniques incentivize truthful answers by concealing participants' responses in various ways. The Crosswise Model aims to do this by pairing a sensitive target item with a non-sensitive baseline item, and only asking participants to indicate whether their responses to the two items are the same or different. Selection of the baseline item is crucial to guarantee participants' perceived and actual privacy and to enable reliable estimates of the sensitive trait. This research makes the following contributions. First, it describes an integrated methodology to select the baseline item, based on conceptual and statistical considerations. The resulting methodology distinguishes four statistical models. Second, it proposes novel Bayesian estimation methods to implement these models. Third, it shows that the new models introduced here improve efficiency over common applications of the Crosswise Model and may relax the required statistical assumptions. These three contributions facilitate applying the methodology in a variety of settings. An empirical application on attitudes toward LGBT issues shows the potential of the Crosswise Model. 
An interactive app, Python and MATLAB codes support broader adoption of the model.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1007-1033"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458659/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141162342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
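The pairing mechanism described in the abstract above can be made concrete with the classical moment estimator for the Crosswise Model. This is a sketch of the basic textbook estimator, not of the four models or the Bayesian methods proposed in the article. It assumes the baseline item has known prevalence p (p not equal to 0.5) and is independent of the sensitive item, so that P(same) = pi * p + (1 - pi) * (1 - p). The function name and the numbers are hypothetical.

```python
import math


def crosswise_estimate(n_same: int, n: int, p: float):
    """Moment estimate of the sensitive-trait prevalence pi and its standard error.

    n_same: respondents answering "same" to the item pair
    n:      total respondents
    p:      known prevalence of the non-sensitive baseline item (must differ from 0.5)
    """
    if n <= 0 or not 0 <= n_same <= n:
        raise ValueError("need 0 <= n_same <= n and n > 0")
    if abs(p - 0.5) < 1e-12:
        raise ValueError("p = 0.5 makes pi unidentifiable")
    lam = n_same / n  # observed proportion of "same" answers
    pi_hat = (lam + p - 1.0) / (2.0 * p - 1.0)
    se = math.sqrt(lam * (1.0 - lam) / n) / abs(2.0 * p - 1.0)
    return pi_hat, se


# Hypothetical survey: 1000 respondents, 700 "same" answers, baseline prevalence 0.25.
pi_hat, se = crosswise_estimate(700, 1000, 0.25)
print(round(pi_hat, 3), round(se, 4))
```

The divisor |2p - 1| shows why the choice of baseline item matters: as p approaches 0.5 privacy protection increases, but the standard error grows without bound.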
Psychometrika | Pub Date: 2024-09-01 | DOI: 10.1007/s11336-024-10002-9
Sandip Sinharay
{"title":"Remarks from the Editor-in-Chief.","authors":"Sandip Sinharay","doi":"10.1007/s11336-024-10002-9","DOIUrl":"10.1007/s11336-024-10002-9","url":null,"abstract":"","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"745-746"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142301122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika | Pub Date: 2024-09-01 | DOI: 10.1007/s11336-024-09999-w
Chun Wang
{"title":"Correction: A Diagnostic Facet Status Model (DFSM) for Extracting Instructionally Useful Information from Diagnostic Assessment.","authors":"Chun Wang","doi":"10.1007/s11336-024-09999-w","DOIUrl":"10.1007/s11336-024-09999-w","url":null,"abstract":"","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1108"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141983980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-05-04 | DOI: 10.1007/s11336-024-09957-6
Peter F Halpin
{"title":"Differential Item Functioning via Robust Scaling.","authors":"Peter F Halpin","doi":"10.1007/s11336-024-09957-6","DOIUrl":"10.1007/s11336-024-09957-6","url":null,"abstract":"<p><p>This paper proposes a method for assessing differential item functioning (DIF) in item response theory (IRT) models. The method does not require pre-specification of anchor items, which is its main virtue. It is developed in two main steps: first by showing how DIF can be re-formulated as a problem of outlier detection in IRT-based scaling and then tackling the latter using methods from robust statistics. The proposal is a redescending M-estimator of IRT scaling parameters that is tuned to flag items with DIF at the desired asymptotic type I error rate. Theoretical results describe the efficiency of the estimator in the absence of DIF and its robustness in the presence of DIF. Simulation studies show that the proposed method compares favorably to currently available approaches for DIF detection, and a real data example illustrates its application in a research context where pre-specification of anchor items is infeasible. The focus of the paper is the two-parameter logistic model in two independent groups, with extensions to other settings considered in the conclusion.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"796-821"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140860216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
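The idea in the abstract above, treating DIF items as outliers in a scaling problem and handling them with a redescending M-estimator, can be illustrated generically. The sketch below is not the estimator proposed in the article: it applies a plain Tukey biweight location estimate to hypothetical item-wise differences in difficulty between two groups, with the conventional tuning constant c = 4.685 and no calibration to a type I error rate. Items that receive zero weight are treated as candidate DIF items.

```python
import statistics


def biweight_location(values, c=4.685, tol=1e-10, max_iter=100):
    """Tukey biweight location via iteratively reweighted means (generic sketch).

    Returns the robust location and the final weights; a weight of 0 means the
    observation was fully rejected as an outlier.
    """
    mu = statistics.median(values)
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad  # MAD rescaled for consistency under normality, held fixed
    weights = [1.0] * len(values)
    if scale == 0.0:
        return mu, weights
    for _ in range(max_iter):
        u = [(v - mu) / (c * scale) for v in values]
        weights = [(1.0 - ui * ui) ** 2 if abs(ui) < 1.0 else 0.0 for ui in u]
        total = sum(weights)
        if total == 0.0:
            break
        new_mu = sum(w * v for w, v in zip(weights, values)) / total
        if abs(new_mu - mu) < tol:
            mu = new_mu
            break
        mu = new_mu
    return mu, weights


# Hypothetical item-wise differences in estimated difficulty between two groups;
# five items share a common shift near 0.5 and the last item deviates.
diffs = [0.50, 0.52, 0.48, 0.51, 0.49, 2.00]
shift, weights = biweight_location(diffs)
flagged = [i for i, w in enumerate(weights) if w == 0.0]
print(round(shift, 3), flagged)
```

Because the biweight redescends to zero, a grossly deviating item has no influence on the estimated common shift, which is the property that makes a pre-specified anchor set unnecessary in this toy setting.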
Psychometrika | Pub Date: 2024-09-01 | Epub Date: 2024-07-04 | DOI: 10.1007/s11336-024-09980-7
Paul De Boeck, Michael L DeKay, Jolynn Pek
{"title":"Adventitious Error and Its Implications for Testing Relations Between Variables and for Composite Measurement Outcomes.","authors":"Paul De Boeck, Michael L DeKay, Jolynn Pek","doi":"10.1007/s11336-024-09980-7","DOIUrl":"10.1007/s11336-024-09980-7","url":null,"abstract":"<p><p>Wu and Browne (Psychometrika 80(3):571-600, 2015. https://doi.org/10.1007/s11336-015-9451-3; henceforth W&B) introduced the notion of adventitious error to explicitly take into account approximate goodness of fit of covariance structure models (CSMs). Adventitious error supposes that observed covariance matrices are not directly sampled from a theoretical population covariance matrix but from an operational population covariance matrix. This operational matrix is randomly distorted from the theoretical matrix due to differences in study implementations. W&B showed how adventitious error is linked to the root mean square error of approximation (RMSEA) and how the standard errors (SEs) of parameter estimates are augmented. Our contribution is to consider adventitious error as a general phenomenon and to illustrate its consequences. Using simulations, we illustrate that its impact on SEs can be generalized to pairwise relations between variables beyond the CSM context. Using derivations, we conjecture that heterogeneity of effect sizes across studies and overestimation of statistical power can both be interpreted as stemming from adventitious error. We also show that adventitious error, if it occurs, has an impact on the uncertainty of composite measurement outcomes such as factor scores and summed scores. The results of a simulation study show that the impact on measurement uncertainty is rather small although larger for factor scores than for summed scores. 
Adventitious error is an assumption about the data generating mechanism; the notion offers a statistical framework for understanding a broad range of phenomena, including approximate fit, varying research findings, heterogeneity of effects, and overestimates of power.</p>","PeriodicalId":54534,"journal":{"name":"Psychometrika","volume":" ","pages":"1055-1073"},"PeriodicalIF":2.9,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458726/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141499657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}