Modified Item-Fit Indices for Dichotomous IRT Models with Missing Data
Xue Zhang, Chun Wang
Applied Psychological Measurement, 2022-11-01 (Epub 2022-09-19), Journal Article
DOI: https://doi.org/10.1177/01466216221125176
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9574083/pdf/
Citations: 0
Abstract
Item-level fit analysis not only serves as a complementary check to global fit analysis but is also essential in scale development, because the fit results guide item revision and/or deletion (Liu & Maydeu-Olivares, 2014). During data collection, responses are often missing for various reasons. Chi-square-based item fit indices (e.g., Yen's $Q_1$, McKinley and Mills's $G^2$, and Orlando and Thissen's $S$-$X^2$ and $S$-$G^2$) are the most widely used statistics for assessing item-level fit. However, total scores play a different role in $S$-$X^2$ and $S$-$G^2$ when data are incomplete than when data are complete. As a result, $S$-$X^2$ and $S$-$G^2$ cannot handle incomplete data directly. To this end, we propose several modified versions of $S$-$X^2$ and $S$-$G^2$ for evaluating item-level fit when response data are incomplete, named $M_{impute}$-$X^2$ and $M_{impute}$-$G^2$, where the subscript "impute" denotes the imputation method. Instead of using observed total scores for grouping, the new indices rely on total scores imputed by either a single imputation method or one of three multiple imputation methods (i.e., two-way with normally distributed errors, corrected item-mean substitution with normally distributed errors, and response function imputation). The new indices are equivalent to $S$-$X^2$ and $S$-$G^2$ when response data are complete. Their performance is evaluated and compared via simulation studies; the manipulated factors include test length, source of misfit, misfit proportion, and missing proportion. The simulation results are consistent with those of Orlando and Thissen (2000, 2003), and different indices are recommended under different conditions.
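To make the grouping idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly cited two-way formula (person mean plus item mean minus overall mean) with a normal error whose variance is estimated from the observed cells, and it assumes every person and every item has at least one observed response. The function names `two_way_impute` and `pearson_item_fit` and the `expected` argument are hypothetical. The model-implied proportions in `expected` would have to come from a fitted IRT model, which is not shown, and sparse-cell collapsing and pooling across multiple imputations are omitted.

```python
import numpy as np


def two_way_impute(X, rng):
    """One imputation draw for a 0/1 response matrix with NaN for missing.

    Assumed formula: TW_ij = PM_i + IM_j - OM, plus a normal error,
    rounded to the nearest admissible score (0 or 1).
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    person_mean = np.nanmean(X, axis=1, keepdims=True)  # PM_i
    item_mean = np.nanmean(X, axis=0, keepdims=True)    # IM_j
    overall_mean = np.nanmean(X)                        # OM
    tw = person_mean + item_mean - overall_mean
    # Error standard deviation estimated from residuals on observed cells.
    resid = (X - tw)[observed]
    sigma = np.sqrt(np.sum(resid ** 2) / (resid.size - 1))
    noisy = tw + rng.normal(0.0, sigma, size=X.shape)
    imputed = np.clip(np.rint(noisy), 0, 1)
    # Observed responses are kept; only missing cells are filled in.
    return np.where(observed, X, imputed)


def pearson_item_fit(completed, item, expected):
    """Pearson-type fit statistic for one item, grouping on total scores.

    expected[k] is the (hypothetical, user-supplied) model-implied
    proportion correct on `item` among examinees with total score k;
    it must lie strictly between 0 and 1.
    """
    completed = np.asarray(completed, dtype=float)
    n_items = completed.shape[1]
    total = completed.sum(axis=1).astype(int)
    stat = 0.0
    # Scores 0 and n_items are skipped: the item response is determined there.
    for k in range(1, n_items):
        in_group = total == k
        n_k = int(in_group.sum())
        if n_k == 0:
            continue
        o_k = completed[in_group, item].mean()
        e_k = expected[k]
        stat += n_k * (o_k - e_k) ** 2 / (e_k * (1.0 - e_k))
    return stat
```

With complete data, `two_way_impute` returns the matrix unchanged, so the grouping reduces to grouping on observed total scores, which mirrors the equivalence property stated in the abstract.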
About the journal:
Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.