{"title":"AllTestSim: Comprehensive Software Tool for Simulating Fixed-Form, Linear-on-the-Fly, Multistage, and Computerized Adaptive Testing.","authors":"Kyung Chris T Han","doi":"10.1177/01466216261449756","DOIUrl":"https://doi.org/10.1177/01466216261449756","url":null,"abstract":"","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261449756"},"PeriodicalIF":1.2,"publicationDate":"2026-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13136228/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147844606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multidimensional Polytomous DIF Detection Methods - A Monte Carlo Simulation Study.","authors":"Ana Ćosić Pilepić, Tamara Mohorić, Vladimir Takšić","doi":"10.1177/01466216261446320","DOIUrl":"https://doi.org/10.1177/01466216261446320","url":null,"abstract":"<p><p>The study compared the effectiveness of four methods for detecting differential item functioning (DIF) in polytomous multidimensional data with a simple structure: the item response theory likelihood ratio test (IRT-LR), two ordinal logistic regression approaches (using raw scores vs. latent trait estimates as the matching variable), and the multidimensional MIMIC-interaction method. Data were generated under a two-dimensional graded response model with 28 five-category items. Simulation conditions manipulated DIF type (uniform, nonuniform), DIF magnitude (0, 0.3, 0.6), group size ratio (1:1, 3:1), latent trait correlation (ρ = 0, 0.5), and the presence of group impact, yielding 40 conditions with 100 replications each. Across conditions, IRT-LR and both logistic regression approaches generally maintained Type I error within acceptable limits, whereas the MIMIC-interaction model showed inflated Type I error in the presence of impact. All methods demonstrated high power for moderate uniform DIF, but detection rates declined substantially for low DIF and for nonuniform DIF. Logistic regression with latent trait estimates showed the most stable overall performance, combining adequate Type I error control with comparatively high power across conditions. Logistic regression with raw scores demonstrated relatively stronger performance for moderate nonuniform DIF. In contrast, IRT-LR exhibited lower power despite conservative Type I error control. Results suggest that regression-based approaches, particularly logistic regression using latent trait estimates, provide robust performance for DIF detection in multidimensional polytomous assessments under simple structure.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261446320"},"PeriodicalIF":1.2,"publicationDate":"2026-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13111547/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147785623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<i>projectLSA</i>: A Shiny Application for Integrated Latent Structure Analysis.","authors":"Hasan Djidu, Heri Retnawati, Samsul Hadi, Haryanto","doi":"10.1177/01466216261446305","DOIUrl":"https://doi.org/10.1177/01466216261446305","url":null,"abstract":"<p><p>Latent structure analysis methods, including latent profile analysis (LPA), latent class analysis (LCA), item response theory (IRT), exploratory factor analysis (EFA), and confirmatory factor analysis (CFA), are widely used in psychological and educational research to model unobserved constructs and identify heterogeneity across individuals. However, applying these methods often requires advanced statistical expertise and the use of multiple specialized software packages with different workflows, which can limit accessibility and increase analytical complexity. This paper introduces <i>projectLSA</i>, a <i>Shiny</i>-based application designed to provide an integrated and user-friendly platform for conducting latent structure analyses. The application enables users to upload data, specify models, estimate parameters, compare model fit using standard indices, and visualize results within a single interface. By integrating several established R packages, <i>projectLSA</i> supports a unified analytical workflow without requiring users to write code. The practical utility of the application is illustrated using built-in simulated datasets that support multiple analytical procedures, including LPA, LCA, IRT, EFA, and CFA. These examples demonstrate how users can estimate, compare, and interpret models efficiently within a consistent workflow. Overall, <i>projectLSA</i> enhances accessibility, consistency, and efficiency in latent structure analysis by reducing technical barriers and supporting interactive and reproducible data analysis.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261446305"},"PeriodicalIF":1.2,"publicationDate":"2026-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13095993/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147785326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference for Disattenuated Correlations.","authors":"Jonas Moss","doi":"10.1177/01466216261440511","DOIUrl":"https://doi.org/10.1177/01466216261440511","url":null,"abstract":"<p><p>When only summary statistics from published studies are available, the Hunter-Schmidt interval is the standard tool for inference on Spearman's disattenuated correlation, but it treats reliability estimates as known constants and ignores their sampling variability. We derive a simple delta method variance that accounts for the uncertainty of all estimates while requiring nothing beyond the summaries already at hand. Under bivariate normality of scores and coefficient alpha from a normal parallel model, the corrected interval is asymptotically valid. In simulations it achieves coverage near nominal, while Hunter-Schmidt can undercover substantially when reliability is imprecisely estimated.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261440511"},"PeriodicalIF":1.2,"publicationDate":"2026-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13035683/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147595635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rise of the Machine: Detecting Aberrant Response Patterns in Survey Instruments Using Autoencoder.","authors":"Cody Ding","doi":"10.1177/01466216261425242","DOIUrl":"10.1177/01466216261425242","url":null,"abstract":"<p><p>Survey questionnaires are essential tools in psychological and educational research, as the data they gather directly influence research conclusions and policy decisions. A major challenge in ensuring data quality is identifying aberrant response patterns that can jeopardize research outcomes, as they may introduce errors into subsequent analyses, potentially resulting in flawed theoretical conclusions and misguided practical applications. This study presents a machine learning solution that employs autoencoder neural networks to detect aberrant response patterns in survey data as a computational method. We evaluated the effectiveness of autoencoder neural networks in identifying response anomalies through both simulated and real data. The results indicate that this approach can effectively detect anomalies in responses, providing researchers with more options for their analyses and subsequent conclusions. Ultimately, this enhances the trustworthiness of findings in psychological and educational research.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261425242"},"PeriodicalIF":1.2,"publicationDate":"2026-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12904810/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146203553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Impact of Latent Density Misspecification on Item Response Theory Equating Methods.","authors":"Kyung Yong Kim, Seongeun Kim, Haeju Lee","doi":"10.1177/01466216261425440","DOIUrl":"10.1177/01466216261425440","url":null,"abstract":"<p><p>Item response theory (IRT) observed and true score equating are often conducted assuming that the latent variable is normally distributed. Although this might be a reasonable assumption for many educational and psychological assessments, not all variables can be approximated by a normal distribution. Under the common-item nonequivalent groups design, the current study examined the impact of latent density misspecification on IRT observed and true score equating. Specifically, equating results provided by two separate calibration estimates based on the Stocking-Lord linking method with normal and uniform weights and three concurrent calibration estimates obtained with different characterizations of the latent densities for the old and new groups were compared using both simulated and real data sets. In general, the concurrent calibration method with the latent densities for the two groups estimated using the empirical histogram method provided equating results with the least amount of error for most of the study conditions. Using normal weights with the Stocking-Lord method generally performed much better than using uniform weights; however, the overall performance of the Stocking-Lord method with normal weights was acceptable only if the latent densities for the two groups were normal distributions or close to normal distributions.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261425440"},"PeriodicalIF":1.2,"publicationDate":"2026-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12900660/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146203548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Score-Based Tests With Fixed Effects Person Parameters in Item Response Theory: Detecting Model Misspecification Including Differential Item Functioning.","authors":"Rudolf Debelak, Charles C Driver","doi":"10.1177/01466216261422480","DOIUrl":"10.1177/01466216261422480","url":null,"abstract":"<p><p>We present a fast, score-based test for detecting model misspecification in item response theory (IRT) models that remains valid when person parameters are treated as fixed effects, as may be used for very large data sets. The new approximation (i) eliminates the need to pre-specify ability groups or priors for person abilities, (ii) does not require explicit functional form assumptions, (iii) works with two estimators designed for very high item/person counts-constrained joint maximum likelihood (CJML) and joint maximum a posteriori (JMAP)-and (iv) requires only a single model fit, making DIF-screening faster and simpler than alternatives based on model comparisons. A spline-based residualization step further suppresses spurious Type I error when the ordering covariate is correlated with ability. Simulations with the two-parameter logistic model show nominal error rates and high power once examinees contribute around 15-20 responses; only extremely short tests (around 10 items) still pose challenges under strong impact. An application to 1,602 reading items and 57,684 students from the <i>Mindsteps</i> platform demonstrates scalability and practical value, flagging 13% of items for gender-related DIF and correlating highly with conventional approaches of explicitly modeling DIF. Together, these results position the proposed test as a robust, computation-light diagnostic for large-scale assessments when classical random-effects approaches are infeasible, ability group structure is unknown or complex, or the shape of DIF effects is unknown or complex.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261422480"},"PeriodicalIF":1.2,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12890607/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146182799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Item Calibration in the Context of the Swedish Scholastic Aptitude Test.","authors":"Jonas Bjermo, Ellinor Fackle Fornius, Frank Miller","doi":"10.1177/01466216261420758","DOIUrl":"10.1177/01466216261420758","url":null,"abstract":"<p><p>Large-scale achievement tests require the existence of item banks with items for use in future tests. Before an item is included into the bank, its characteristics need to be estimated. The process of estimating the item characteristics is called item calibration. For the quality of the future achievement tests, it is important to perform this calibration well and it is desirable to estimate the item characteristics as efficiently as possible. Methods of optimal design have been developed to allocate pretest items to examinees with the most suited ability. Theoretical evidence shows advantages with using ability-dependent allocation of pretest items. However, it is not clear whether these theoretical results hold also in a real testing situation. In this paper, we investigate the performance of an optimal ability-dependent allocation in the context of the Swedish Scholastic Aptitude Test (SweSAT) and quantify the gain from using the optimal allocation. On average over all items, we see an improved precision of calibration. While this average improvement is moderate, we are able to identify for what kind of items the method works well. This enables targeting specific item types for optimal calibration. We also discuss possibilities for improvements of the method.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261420758"},"PeriodicalIF":1.2,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12880929/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146144195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating and Fitting the Non-continuous category scored Polytomous Items under the Weighted Score Logistic Model and its Simulation Study.","authors":"Xiaozhu Jian, Buyun Dai, Yeqi Qing, YuanPing Deng","doi":"10.1177/01466216261420305","DOIUrl":"10.1177/01466216261420305","url":null,"abstract":"<p><p>This study presents a novel extension of the weighted score logistic model (WSLM). The WSLM is an advancement of the traditional dichotomous logistic model that incorporates an additional weighted score parameter. This model is specifically designed to analyze non-continuous category scored polytomous items in educational and psychological testing contexts. Within the WSLM framework, the mean difficulty parameter reflects the overall item difficulty, while both discrimination and mean difficulty parameters are estimated using marginal maximum likelihood estimation. A Monte Carlo simulation study was conducted to evaluate the performance of the WSLM, which demonstrated low levels of bias and root mean square error (RMSE) of item parameters, indicative of accurate parameter recovery. Under most simulation conditions, the fit statistics Q1 and Q4 for polytomous items under the WSLM remained below their respective critical chi-square values, suggesting acceptable model-data fit. These results support the applicability and robustness of the WSLM in practical assessment settings involving complex scoring schemes.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":" ","pages":"01466216261420305"},"PeriodicalIF":1.2,"publicationDate":"2026-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12854999/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146108014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}