{"title":"Outlier Detection Using t-test in Rasch IRT Equating under NEAT Design.","authors":"Chunyan Liu, Daniel Jurich","doi":"10.1177/01466216221124045","DOIUrl":null,"url":null,"abstract":"<p><p>In equating practice, the existence of outliers in the anchor items may deteriorate the equating accuracy and threaten the validity of test scores. Therefore, stability of the anchor item performance should be evaluated before conducting equating. This study used simulation to investigate the performance of the <i>t</i>-test method in detecting outliers and compared its performance with other outlier detection methods, including the logit difference method with 0.5 and 0.3 as the cutoff values and the robust <i>z</i> statistic with 2.7 as the cutoff value. The investigated factors included sample size, proportion of outliers, item difficulty drift direction, and group difference. Across all simulated conditions, the <i>t</i>-test method outperformed the other methods in terms of sensitivity of flagging true outliers, bias of the estimated translation constant, and the root mean square error of examinee ability estimates.</p>","PeriodicalId":48300,"journal":{"name":"Applied Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":1.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679927/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Psychological Measurement","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/01466216221124045","RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/9/6 0:00:00","PubModel":"Epub","JCR":"Q4","JCRName":"PSYCHOLOGY, MATHEMATICAL","Score":null,"Total":0}
引用次数: 0
Abstract
In equating practice, the existence of outliers in the anchor items may deteriorate the equating accuracy and threaten the validity of test scores. Therefore, stability of the anchor item performance should be evaluated before conducting equating. This study used simulation to investigate the performance of the t-test method in detecting outliers and compared its performance with other outlier detection methods, including the logit difference method with 0.5 and 0.3 as the cutoff values and the robust z statistic with 2.7 as the cutoff value. The investigated factors included sample size, proportion of outliers, item difficulty drift direction, and group difference. Across all simulated conditions, the t-test method outperformed the other methods in terms of sensitivity of flagging true outliers, bias of the estimated translation constant, and the root mean square error of examinee ability estimates.
期刊介绍:
Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.