Rose E Stafford, Edward W Wolfe, Jodi M Casablanca, Tian Song
Journal of Applied Measurement, 19(3), 243-257, 2018.
Detecting Rater Effects under Rating Designs with Varying Levels of Missingness.
Previous research has shown that indices obtained from partial credit model (PCM) estimates can detect severity and centrality rater effects, though it remains unknown how rater effect detection is affected by the missingness inherent in double-scoring rating designs. This simulation study evaluated the impact of missing data on the detection of rater severity and centrality. Data were generated for each rater effect type under conditions that varied rater pool quality, rater effect prevalence and magnitude, and extent of missingness. Raters were flagged using rater location as a severity indicator and the standard deviation of rater thresholds as a centrality indicator. Two methods of identifying extreme scores on these indices were compared. Results indicate that both methods yield low Type I and Type II error rates (i.e., rates of incorrectly flagging non-effect raters and of failing to flag effect raters) and that the presence of missing data has negligible impact on the detection of severe and central raters.
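The two indices named in the abstract can be illustrated with a minimal sketch. It assumes that each rater's PCM threshold estimates are already available, takes rater location to be the mean of that rater's thresholds, and treats higher location as more severe and larger threshold spread as more central. The abstract does not say which two methods were used to identify extreme scores, so the z-score rule, the cutoff of 2.0, the function names, and the threshold values below are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

def rater_indices(thresholds):
    """Return (location, threshold SD) for one rater's threshold estimates.

    Assumption: location is taken as the mean of the rater's thresholds.
    """
    return mean(thresholds), stdev(thresholds)

def flag_extreme(index_by_rater, cutoff=2.0):
    """Flag raters whose index is more than `cutoff` pool SDs above the pool mean.

    Hypothetical flagging rule; not necessarily either method from the study.
    """
    values = list(index_by_rater.values())
    pool_mean = mean(values)
    pool_sd = stdev(values)
    if pool_sd == 0:
        return set()
    return {r for r, v in index_by_rater.items() if (v - pool_mean) / pool_sd > cutoff}

# Made-up threshold estimates (in logits) for a pool of ten raters.
estimates = {
    "R1": [-1.0, 0.0, 1.0],
    "R2": [-0.9, 0.1, 1.1],
    "R3": [-1.1, -0.1, 0.9],
    "R4": [-1.0, 0.1, 0.9],
    "R5": [-0.8, 0.0, 1.1],
    "R6": [-1.2, 0.0, 0.9],
    "R7": [-0.9, 0.0, 0.9],
    "R8": [-1.1, 0.0, 1.1],
    "SEVERE": [0.5, 1.5, 2.5],    # all thresholds shifted upward
    "CENTRAL": [-2.5, 0.0, 2.5],  # thresholds widely dispersed
}

locations = {r: rater_indices(t)[0] for r, t in estimates.items()}
spreads = {r: rater_indices(t)[1] for r, t in estimates.items()}

severity_flags = flag_extreme(locations)   # raters with extreme location
centrality_flags = flag_extreme(spreads)   # raters with extreme threshold SD

print("severity flags:", sorted(severity_flags))
print("centrality flags:", sorted(centrality_flags))
```

A pool-relative cutoff of this kind depends on the composition of the rater pool, which is one reason a study such as this one would vary rater pool quality and effect prevalence when evaluating detection accuracy.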