Accounting for item calibration error in computerized adaptive testing.

IF 4.6 · CAS Region 2 (Psychology) · Q1 Psychology, Experimental
Aron Fink, Christoph König, Andreas Frey
{"title":"Accounting for item calibration error in computerized adaptive testing.","authors":"Aron Fink, Christoph König, Andreas Frey","doi":"10.3758/s13428-025-02649-8","DOIUrl":null,"url":null,"abstract":"<p><p>In computerized adaptive testing (CAT), item parameter estimates derived from calibration studies are considered to be known and are used as fixed values for adaptive item selection and ability estimation. This is not completely accurate because these item parameter estimates contain a certain degree of error. If this error is random, the typical CAT procedure leads to standard errors of the final ability estimates that are too small. If the calibration error is large, it has been shown that the accuracy of the ability estimates is negatively affected due to the capitalization on chance problem, especially for extreme ability levels. In order to find a solution for this fundamental problem of CAT, we conducted a Monte Carlo simulation study to examine three approaches that can be used to consider the uncertainty of item parameter estimates in CAT. The first two approaches used a measurement error modeling approach in which item parameters were treated as covariates that contained errors. The third approach was fully Bayesian. Each of the approaches was compared with regard to the quality of the resulting ability estimates. The results indicate that each of the three approaches is capable of reducing bias and the mean squared error (MSE) of the ability estimates, especially for high item calibration errors. The Bayesian approach clearly outperformed the other approaches. We recommend the Bayesian approach, especially for application areas in which the recruitment of a large calibration sample is infeasible.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"126"},"PeriodicalIF":4.6000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11947018/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Behavior Research Methods","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3758/s13428-025-02649-8","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0

Abstract

In computerized adaptive testing (CAT), item parameter estimates derived from calibration studies are considered to be known and are used as fixed values for adaptive item selection and ability estimation. This is not completely accurate because these item parameter estimates contain a certain degree of error. If this error is random, the typical CAT procedure leads to standard errors of the final ability estimates that are too small. If the calibration error is large, it has been shown that the accuracy of the ability estimates is negatively affected due to the capitalization on chance problem, especially for extreme ability levels. In order to find a solution for this fundamental problem of CAT, we conducted a Monte Carlo simulation study to examine three approaches that can be used to consider the uncertainty of item parameter estimates in CAT. The first two approaches used a measurement error modeling approach in which item parameters were treated as covariates that contained errors. The third approach was fully Bayesian. Each of the approaches was compared with regard to the quality of the resulting ability estimates. The results indicate that each of the three approaches is capable of reducing bias and the mean squared error (MSE) of the ability estimates, especially for high item calibration errors. The Bayesian approach clearly outperformed the other approaches. We recommend the Bayesian approach, especially for application areas in which the recruitment of a large calibration sample is infeasible.
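To make the underlying issue concrete, the sketch below contrasts the usual practice of plugging in calibration estimates as if they were error-free with a simple fully Bayesian-style alternative that averages the ability posterior over item parameters drawn from their calibration-error distributions. This is not the authors' implementation; the 2PL model, the item bank, the calibration standard errors, and the response pattern are invented purely for illustration. The posterior standard deviation from the second computation is typically somewhat larger, which illustrates why treating item parameters as fixed can understate the uncertainty of the final ability estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

def item_prob(theta, a, b):
    """2PL response probability P(X = 1 | theta, a, b)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def unnormalized_posterior(responses, a, b, grid):
    """Likelihood of the response pattern times a N(0, 1) prior, on a theta grid."""
    p = item_prob(grid[:, None], a, b)                     # shape (grid points, items)
    like = np.prod(np.where(responses, p, 1 - p), axis=1)  # product over items
    return like * np.exp(-0.5 * grid**2)

# Hypothetical item bank: point estimates plus calibration standard errors (illustrative values)
a_hat, b_hat = np.array([1.2, 0.8, 1.5]), np.array([-0.5, 0.0, 0.7])
se_a,  se_b  = np.full(3, 0.15), np.full(3, 0.20)
responses = np.array([1, 1, 0], dtype=bool)
grid = np.linspace(-4, 4, 161)

# Standard CAT practice: calibration estimates treated as known, fixed values
post_fixed = unnormalized_posterior(responses, a_hat, b_hat, grid)
post_fixed /= post_fixed.sum()

# Sketch of a fully Bayesian-style alternative: marginalize over item parameter
# uncertainty by averaging the likelihood over draws from the calibration-error distributions
post_mc = np.zeros_like(grid)
for _ in range(2000):
    a = rng.normal(a_hat, se_a)
    b = rng.normal(b_hat, se_b)
    post_mc += unnormalized_posterior(responses, a, b, grid)
post_mc /= post_mc.sum()

def mean_sd(post):
    m = np.sum(grid * post)
    return m, np.sqrt(np.sum((grid - m) ** 2 * post))

print("fixed item parameters:      mean=%.3f  SD=%.3f" % mean_sd(post_fixed))
print("with calibration error:     mean=%.3f  SD=%.3f" % mean_sd(post_mc))
```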

Source journal
CiteScore: 10.30 · Self-citation rate: 9.30% · Annual articles: 266
Journal description: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.