Predictive Fit Metrics for Item Response Models.

Impact factor: 1.0 | Zone 4 (Psychology) | JCR Q4 (Psychology, Mathematical)
Applied Psychological Measurement Pub Date : 2022-03-01 Epub Date: 2022-02-13 DOI:10.1177/01466216211066603
Benjamin A Stenhaug, Benjamin W Domingue
Applied Psychological Measurement, Volume 46, Issue 2, pp. 136-155. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908407/pdf/10.1177_01466216211066603.pdf

Abstract

The fit of an item response model is typically conceptualized as whether a given model could have generated the data. This study advocates an alternative view of fit, "predictive fit," based on the model's ability to predict new data. The authors define two prediction tasks: "missing responses prediction," where the goal is to predict an in-sample person's response to an in-sample item, and "missing persons prediction," where the goal is to predict an out-of-sample person's string of responses. Based on these prediction tasks, two predictive fit metrics are derived for item response models that assess how well an estimated item response model fits the data-generating model. These metrics are based on long-run out-of-sample predictive performance (i.e., if the data-generating model produced infinite amounts of data, what is the quality of a model's predictions on average?). Simulation studies are conducted to identify the prediction-maximizing model across a variety of conditions. For example, defining prediction in terms of missing responses, greater average person ability, and greater item discrimination are all associated with the 3PL model producing relatively worse predictions, and thus lead to greater minimum sample sizes for the 3PL model. In each simulation, the prediction-maximizing model is compared to the models selected by Akaike's information criterion (AIC), the Bayesian information criterion (BIC), and likelihood ratio tests. The performance of these methods is found to depend on the prediction task of interest. In general, likelihood ratio tests often select overly flexible models, while BIC selects overly parsimonious models. The authors use Programme for International Student Assessment (PISA) data to demonstrate how to use cross-validation to directly estimate the predictive fit metrics in practice. The implications for item response model selection in operational settings are discussed.
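
The two prediction tasks imply two different ways of holding out data during cross-validation. The following Python sketch illustrates that difference only; it is not the authors' implementation. The function names are hypothetical, the mean held-out log-likelihood is assumed as the scoring rule (the abstract does not name one), and the predictor is a deliberately simple placeholder (item proportion-correct) where a fitted item response model would go.

```python
import numpy as np


def missing_responses_folds(n_persons, n_items, n_folds, rng):
    """Give every (person, item) cell a fold label in 0..n_folds-1.

    Holding out one fold hides scattered cells rather than whole rows.
    """
    return rng.integers(0, n_folds, size=(n_persons, n_items))


def missing_persons_folds(n_persons, n_folds, rng):
    """Give every person (row) a fold label; a held-out fold hides whole response strings."""
    return rng.permutation(n_persons) % n_folds


def held_out_log_likelihood(responses, probs, held_out):
    """Mean Bernoulli log-likelihood of the 0/1 responses where held_out is True."""
    p = np.clip(probs[held_out], 1e-12, 1 - 1e-12)
    y = responses[held_out]
    return float(np.mean(y * np.log(p) + (1 - y) * np.log1p(-p)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_persons, n_items, n_folds = 200, 10, 5

    # Simulate Rasch (1PL) data: P(correct) = logistic(ability - difficulty).
    ability = rng.normal(size=(n_persons, 1))
    difficulty = rng.normal(size=(1, n_items))
    p_true = 1 / (1 + np.exp(-(ability - difficulty)))
    responses = (rng.random((n_persons, n_items)) < p_true).astype(int)

    cell_fold = missing_responses_folds(n_persons, n_items, n_folds, rng)
    person_fold = missing_persons_folds(n_persons, n_folds, rng)

    scores = {"missing responses": [], "missing persons": []}
    for k in range(n_folds):
        row_mask = np.broadcast_to((person_fold == k)[:, None], responses.shape)
        for task, held_out in (("missing responses", cell_fold == k),
                               ("missing persons", row_mask)):
            train = ~held_out
            # Placeholder predictor (NOT an IRT model): item proportion-correct
            # computed from training cells only.
            item_p = (responses * train).sum(axis=0) / train.sum(axis=0)
            probs = np.broadcast_to(item_p, responses.shape)
            scores[task].append(held_out_log_likelihood(responses, probs, held_out))

    for task, values in scores.items():
        print(task, round(float(np.mean(values)), 4))
```

To compare candidate models such as the 1PL, 2PL, and 3PL, the placeholder predictor would be replaced by each model fitted to the training cells, and the model with the highest average held-out score would be preferred for that prediction task.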


Source journal: Applied Psychological Measurement
CiteScore: 2.30 | Self-citation rate: 8.30% | Articles published: 50
About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.