Educational and Psychological Measurement — Latest Articles

Assessing the Unconditional and Conditional External Validity of Noncognitive Test Scores: A Unifying Model-Based Proposal.
IF 2.3 · CAS Zone 3 · Psychology
Educational and Psychological Measurement Pub Date : 2026-04-30 DOI: 10.1177/00131644261440168
Pere J Ferrando, Fabia Morales-Vives, Silvia Duran-Bonavila, David Navarro-González
Evidence of external validity based on individual score estimates is still relevant in many psychometric applications. From a model-based perspective, however, the topic appears to have been rather neglected in recent decades. Thus, in structural equation modelling (SEM), this evidence is sought to be obtained structurally, bypassing the scoring stage. And, in item response theory (IRT), the score interest mostly focuses on internal properties. Taking this state of affairs into account, this paper develops and proposes a model-based approach, intended for noncognitive measures, that combines SEM and IRT developments, and which allows a detailed assessment of the external validity of a class of score estimates to be carried out. The starting point is a general extended model that also includes the relevant external variables. From this general model, four well-known extended IRT models can be derived and fitted at the structural level. Next, on the basis of the structural results, a series of unconditional (population-dependent) and conditional (population-independent) indices that describe the model-implied relation between the score estimates and each external variable are developed and proposed. The practical relevance of the proposal is discussed mainly around three applications: assessing model appropriateness, obtaining point and interval prediction estimates at the individual level, and shortening a test while optimizing the external validity of the resulting version. The functioning of the proposal is illustrated using a real-data example.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13133035/pdf/
Citations: 0
Multiplicity Control for Structural Equation Modeling in lavaan: A Practical Workflow for False Discovery Rate Adjustment.
Educational and Psychological Measurement Pub Date : 2026-04-28 DOI: 10.1177/00131644261442141
Giuseppe Corbelli
Structural equation modeling (SEM) is widely used in educational and behavioral research, but applied SEM often involves simultaneous tests of many structural paths. When many coefficients are evaluated at nominal thresholds, the probability of false positives and the expected number of false discoveries can be substantial even when global fit indices indicate close fit, encouraging substantive interpretation of chance findings. Building on prior work on multiplicity control in SEM, this article presents a practical workflow for false discovery rate (FDR) adjustment of families of SEM parameter tests obtained from fitted lavaan model objects, including the dependence-robust Benjamini-Yekutieli (BY) procedure, and provides an R implementation to support routine use. In a Monte Carlo study (1,000 replications; N = 500) with nine latent factors, a correctly specified measurement model, and an overspecified structural model with 33 candidate regressions (8 non-zero), nominal p < .05 produced at least one false positive in 69.3% of samples and a mean of 1.182 false-positive paths. BY adjustment reduced the mean number of false positives to 0.073, while the mean number of detected true effects declined from 6.358 to 5.857. A sensitivity analysis across three dependency conditions indicated that BY-FDR was more robust to the direction and magnitude of parameter dependence, whereas BH's false-positive control weakened under negative dependence. These results suggest that dependence-robust FDR adjustment can be integrated into a standard SEM workflow with lavaan in R, and may substantially reduce false positives with a modest reduction in detected true effects.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13124904/pdf/
Citations: 0
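The BY adjustment this article builds its workflow around is compact enough to sketch outside R. The following is a minimal pure-Python illustration (not the article's lavaan-based R implementation): each p-value from a family of structural-path tests is scaled by the harmonic correction factor c(m) = Σ 1/k, which is what makes BY valid under arbitrary dependence.

```python
def by_adjust(pvals):
    """Benjamini-Yekutieli FDR adjustment, valid under arbitrary dependence.

    Adjusted p for the i-th smallest p-value is p * c(m) * m / i,
    made monotone by taking running minima from the largest rank down.
    """
    m = len(pvals)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # step-up pass from the largest p-value
        i = order[rank - 1]
        running_min = min(running_min, min(1.0, pvals[i] * c_m * m / rank))
        adjusted[i] = running_min
    return adjusted
```

A path would then be retained when its adjusted value falls below the chosen α (.05 in the article's simulations); the c(m) factor is also why BY loses some power relative to plain BH, matching the modest drop in detected true effects the abstract reports.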
Controlling the False Discovery Rate in DIF Detection With e-Values: Evidence From Multidimensional and Testlet Simulations.
Educational and Psychological Measurement Pub Date : 2026-04-16 DOI: 10.1177/00131644261433236
Shan Huang, David Goretzko
This study presents the first application of e-value-based false discovery rate (FDR) control to Differential Item Functioning (DIF) detection, addressing long-standing limitations of p-value-based approaches when model assumptions are violated (for example, under multidimensionality, local item dependence, or extreme sample sizes). Two comprehensive simulation studies were conducted to evaluate e-BH (the e-value analogue of BH) procedures, using K-fold and Multisplit likelihood-ratio e-values, under (a) multidimensional contamination and (b) testlet-based local dependence. Across both scenarios, e-BH consistently provided stronger and more stable control of Type I error, FDR, and family-wise error rate (FWER) than classical procedures such as Benjamini-Hochberg (BH) and Holm. Even under severe model misspecification, e-BH maintained substantially lower false-positive rates while remaining relatively competitive in terms of Type II error. A key finding concerns sample size: classical p-value methods exhibited inflation of Type I error as N increased, whereas e-BH preserved stable error control due to its model-agnostic calibration. An empirical application using Progress in International Reading Literacy Study (PIRLS) data further demonstrated that e-BH produces a more defensible and operationally sustainable set of DIF flags than traditional approaches. Together, these results establish e-values as a powerful and robust evidential tool for DIF detection in modern assessment contexts.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13086766/pdf/
Citations: 0
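The e-BH decision rule itself is simple: with m hypotheses, reject the k* with the largest e-values, where k* is the largest k such that the k-th largest e-value is at least m/(k·α). A minimal sketch of that rule follows; constructing the K-fold or Multisplit likelihood-ratio e-values the article evaluates is the harder part and is not reproduced here.

```python
def e_bh(evalues, alpha=0.05):
    """e-BH procedure: reject the k* hypotheses with the largest e-values,
    where k* = max{k : k-th largest e-value >= m / (k * alpha)}.
    Controls FDR at level alpha under arbitrary dependence."""
    m = len(evalues)
    order = sorted(range(m), key=lambda i: -evalues[i])  # indices, descending e
    k_star = 0
    for k in range(1, m + 1):
        if evalues[order[k - 1]] >= m / (k * alpha):
            k_star = k
    return sorted(order[:k_star])  # indices of flagged (rejected) hypotheses
```

Note the threshold m/(k·α) never involves the null distribution of a test statistic, which is the "model-agnostic calibration" the abstract credits for stable error control as N grows.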
Integrating Ensemble Clustering and Text Embeddings for Estimating the Factor Loadings of Self-Report Scales.
Educational and Psychological Measurement Pub Date : 2026-04-12 DOI: 10.1177/00131644261430762
Nathaniel M Voss, Felix Y Wu, Anoop A Javalagi, Harrison J Kell
Advances in large language models provide opportunities to evaluate the characteristics of scales prior to data collection. In this study, we explore whether item text can be used to predict a scale's psychometric properties. Specifically, we examine whether clustering consensus (i.e., the frequency with which items are grouped with other items from the same underlying factor across multiple clustering algorithms) and a cosine similarity metric (i.e., the semantic similarity of items to other items from the same factor) can be used to predict exploratory factor analysis (EFA) factor loadings. Across six scales with varying sample sizes and numbers of factors and items, we found that both the cosine similarity and ensemble clustering consensus methods predicted factor loading values. While the methods share some conceptual and empirical overlap, and results vary by scale, the ensemble clustering approach explains incremental variance beyond cosine similarity in predicting factor loadings. Using both methods in conjunction can help identify problematic items prior to data collection and help researchers develop better scales from the outset, potentially saving time and resources and increasing the likelihood of developing sound measures.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13076461/pdf/
Citations: 0
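The cosine-similarity half of this pipeline is easy to illustrate. A minimal sketch follows, assuming item embeddings have already been obtained from some sentence-embedding model; the function names and the within-factor averaging are illustrative, not the authors' code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def within_factor_similarity(embeddings, factor_items):
    """Mean cosine similarity of each item to the other items on its factor,
    a rough text-only proxy for how strongly the item should load."""
    scores = {}
    for item in factor_items:
        others = [j for j in factor_items if j != item]
        scores[item] = sum(
            cosine(embeddings[item], embeddings[j]) for j in others
        ) / len(others)
    return scores
```

An item whose score is far below its factor-mates' would be a candidate for revision before any data are collected, which is the use case the abstract describes.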
How Extreme Is It Anyways?: An Empirical Investigation Into the Prevalence and Strength of Extreme Response Style.
Educational and Psychological Measurement Pub Date : 2026-04-09 DOI: 10.1177/00131644261435119
Martijn Schoenmakers, Jesper Tijmstra, Jeroen Kornelis Vermunt, Maria Bolsinova
Extreme response style (ERS), the tendency of participants to endorse the extreme categories of an item partially independent of item content, has repeatedly been found to decrease the validity of Likert-type scale results. For this reason, many IRT models have been developed that attempt to detect and correct for ERS. Despite the substantial literature on ERS and ERS modeling, several important questions remain. To date, there is no clear estimate of how often ERS occurs in practice across a variety of scales and populations. In addition, there is little guidance on what item parameters for ERS models are commonly found in empirical data, while this information is crucial to inform future methodological studies utilizing ERS models. Finally, there is only limited information available on which ERS models tend to fit the data best. The current study addresses these three issues by analyzing data from the Programme for International Student Assessment using a generalized partial credit model, several multidimensional nominal response models (MNRMs), and several IRTree models. Results indicate an extremely high prevalence of ERS across scales, populations, and timepoints. Item parameters for future methodological studies are presented, and a general preference for IRTree models over MNRM models is found in many datasets. Implications for future studies are discussed, and recommendations for practice are made.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13068779/pdf/
Citations: 0
Faking in High-Stakes Personality Assessments: A Response-Time-Based Latent Response Mixture Modeling Approach.
Educational and Psychological Measurement Pub Date : 2026-03-18 DOI: 10.1177/00131644261422169
Timo Seitz, Esther Ulitzsch
When personality assessments are employed in high-stakes contexts, there is a risk that test-takers provide overly positive descriptions of themselves. This response bias is known as faking and has often been addressed in latent variable models through an additional dimension capturing each test-taker's degree of faking. Such models typically assume a homogeneous response strategy for all test-takers, with substantive traits and faking jointly influencing responses to all items. In this article, we present a latent response mixture item response theory (IRT) model of faking that accounts for changes in test-takers' response strategies over the course of the assessment. The model translates theoretical considerations about test-taker behavior into different model components for item responses and corresponding item-level response times (RTs), thereby making it possible to account for, identify, and investigate different faking-related response strategies at the person-by-item level. In a parameter recovery study, we found that the model parameters can be estimated well under realistic conditions. We also applied the model to an empirical dataset (N = 1,824) from a job application context, showcasing its utility in real high-stakes assessment data. We conclude the article by discussing the role of the model for psychological measurement as well as substantive research.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999537/pdf/
Citations: 0
Conditional Dependencies Between Response Time and Item Discrimination: An Item-Level Meta-Analysis.
Educational and Psychological Measurement Pub Date : 2026-03-17 DOI: 10.1177/00131644261426972
Joshua B Gilbert, William S Young, Zachary Himmelsbach, Esther Ulitzsch, Benjamin W Domingue
The use of process data, such as response time (RT), in psychometrics has generally focused on the relationship between speed and accuracy. The potential relationships between RT and item discrimination remain less explored. In this study, we propose a model for simultaneously estimating the relationships between RT and item discrimination at the person, item, and person-by-item (residual) levels and illustrate our approach through an item-level meta-analysis of 40 empirical data sets comprising 1.84 million item responses. We find no evidence of average differences in item discrimination between items of different time intensity or persons of different average RT, while residual RT strongly and negatively predicts item discrimination (pooled coef. = -.27% per 1% difference in RT, SE = .04, τ = .17). While heterogeneity is high, we find little evidence of moderation by overall data set characteristics. Flexible generalized additive models show that the relationship between residual RT and item discrimination is generally curvilinear, with discrimination maximized just below average RT and minimized at the extremes. Our results suggest that RT data can provide insights into the measurement properties of educational and psychological assessments, but that the relationships between RT and item discrimination are highly variable.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12995739/pdf/
Citations: 0
Estimating Trends With Differential Item Functioning: A Comparison of Five IRT-Based Approaches.
Educational and Psychological Measurement Pub Date : 2026-03-13 DOI: 10.1177/00131644251408818
Oskar Engels, Oliver Lüdtke, Alexander Robitzsch
In longitudinal assessments, tests are frequently used to estimate trends over time. However, when item parameters lack invariance, time-point comparisons can be distorted, necessitating appropriate statistical methods to achieve accurate estimation. This study compares trend estimates under the two-parameter logistic (2PL) model with item parameter drift (IPD) across five trend-estimation approaches for two time points. First, concurrent calibration, which jointly estimates item parameters across multiple time points. Second, fixed calibration, which estimates item parameters at a single time point and fixes them at the other time point. Third, robust linking with Haberman and Haebara as linking methods, using L_p or L_0 losses. Fourth, partial invariance, in which non-invariant items are detected using likelihood-ratio tests or the root mean square deviation statistic with fixed or data-driven cutoffs, and trend estimates are then recomputed using only the detected invariant items. Fifth, regularized estimation under a smooth Bayesian information criterion (SBIC), shrinking small or null IPD effects toward zero while estimating all others as nonzero. Bias and relative root mean square error (RMSE) were evaluated for the mean and SD at T2. An empirical example applying the trend-estimation approaches to synthetic longitudinal reading data is provided. The results indicate that regularized estimation with SBIC performed best across conditions, maintaining low bias and RMSE, followed by the robust linking methods. Specifically, Haberman linking with the L_0 loss function showed superior performance under unbalanced IPD, outperforming the partial invariance approaches. Concurrent and fixed calibration showed the poorest trend recovery under unbalanced IPD conditions.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12987755/pdf/
Citations: 0
Discriminating Between Attribute, Item-Position, and Wording Effects by the Congeneric and Tau-Equivalent Confirmatory Factor Analysis Models.
Educational and Psychological Measurement Pub Date : 2026-03-11 DOI: 10.1177/00131644261419028
Karl Schweizer, Xuezhu Ren, Tengfei Wang
The capability of confirmatory factor analysis to discriminate common systematic variation of attribute, item-position, and wording effects was investigated using the congeneric and tau-equivalent models. The simulated data, generated according to four approaches, included gradually increased amounts of item-position or wording effect variation while the amount of attribute variation was kept constant. The congeneric model always signified good model fit independently of the type and amount of additional common systematic variation; that is, there was no discrimination. In applications of the tau-equivalent model, the increase in item-position or wording effect variation led to a change from good to bad model fit; that is, there was negative discrimination. In contrast, the additionally considered two-factor tau model discriminated positively. As a consequence of these results, we recommend pre-screening data for method effects.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12979218/pdf/
Citations: 0
Estimation of Conditional Standard Errors of Measurement for MLE Scores in MST.
Educational and Psychological Measurement Pub Date : 2026-02-25 DOI: 10.1177/00131644261420391
Yuanyuan J Stirn, Won-Chan Lee
This paper proposes an information-based analytic method for calculating the conditional standard error of measurement (CSEM) in multistage testing (MST) using maximum likelihood estimation. The accuracy of the proposed method was evaluated by comparing CSEMs computed with the analytic method against those obtained from simulation across four MST designs. The results show that analytic and simulation-based CSEMs converge as test length increases, indicating that the proposed method provides a reliable approximation for longer tests. However, shorter tests and more complex MST designs require additional items to achieve comparable accuracy. The study also compared the proposed method with Park et al.'s analytic approach. Practical implications of the proposed method are discussed.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12945742/pdf/
Citations: 0
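The core identity behind an information-based CSEM is standard: for an MLE score, CSEM(θ) is the inverse square root of the test information of the items actually administered. The sketch below illustrates only that identity for the 2PL model; the paper's MST-specific treatment (routing across modules) is not reproduced, and the item list and scaling constant D = 1.702 are illustrative assumptions.

```python
import math

def p_2pl(theta, a, b, D=1.702):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def csem_mle(theta, items, D=1.702):
    """Information-based CSEM for an MLE score: 1 / sqrt(test information),
    with I(theta) = sum of (D*a)^2 * P * (1 - P) over administered items.

    items: list of (a, b) discrimination/difficulty pairs."""
    info = 0.0
    for a, b in items:
        p = p_2pl(theta, a, b, D)
        info += (D * a) ** 2 * p * (1.0 - p)
    return 1.0 / math.sqrt(info)
```

Because each routed examinee sees a different module sequence, an MST CSEM curve is piecewise over the designs, which is why the analytic and simulation-based values in the abstract only converge as test length grows.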