{"title":"用多面Rasch评定量表测量模型考察音乐表演评定中评价者的判断。","authors":"Pey Shin Ooi, George Engelhard","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>The fairness of raters in music performance assessment has become an important concern in the field of music. The assessment of students' music performance depends in a fundamental way on rater judgements. The quality of rater judgements is crucial to provide fair, meaningful and informative assessments of music performance. There are many external factors that can influence the quality of rater judgements. Previous research has used different measurement models to examine the quality of rater judgements (e.g., generalizability theory). There are limitations with the previous analysis methods that are based on classical test theory and its extensions. In this study, we use modern measurement theory (Rasch measurement theory) to examine the quality of rater judgements. The many-facets Rasch rating scale model is employed to investigate the extent of rater-invariant measurement in the context of music performance assessments related to university degrees in Malaysia (159 students rated by 24 raters). We examine the rating scale structure, the severity levels of the raters, and the judged difficulty of the items. We also examine the interaction effects across musical instrument subgroups (keyboard, strings, woodwinds, brass, percussions and vocal). The results suggest that there were differences in severity levels among the raters. The results of this study also suggest that raters had different severity levels when rating different musical instrument subgroups. The implications for research, theory and practice in the assessment of music performance are included in this paper.</p>","PeriodicalId":73608,"journal":{"name":"Journal of applied measurement","volume":"20 1","pages":"79-99"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Examining Rater Judgements in Music Performance Assessment using Many-Facets Rasch Rating Scale Measurement Model.\",\"authors\":\"Pey Shin Ooi, George Engelhard\",\"doi\":\"\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The fairness of raters in music performance assessment has become an important concern in the field of music. The assessment of students' music performance depends in a fundamental way on rater judgements. The quality of rater judgements is crucial to provide fair, meaningful and informative assessments of music performance. There are many external factors that can influence the quality of rater judgements. Previous research has used different measurement models to examine the quality of rater judgements (e.g., generalizability theory). There are limitations with the previous analysis methods that are based on classical test theory and its extensions. In this study, we use modern measurement theory (Rasch measurement theory) to examine the quality of rater judgements. The many-facets Rasch rating scale model is employed to investigate the extent of rater-invariant measurement in the context of music performance assessments related to university degrees in Malaysia (159 students rated by 24 raters). We examine the rating scale structure, the severity levels of the raters, and the judged difficulty of the items. We also examine the interaction effects across musical instrument subgroups (keyboard, strings, woodwinds, brass, percussions and vocal). The results suggest that there were differences in severity levels among the raters. The results of this study also suggest that raters had different severity levels when rating different musical instrument subgroups. The implications for research, theory and practice in the assessment of music performance are included in this paper.</p>\",\"PeriodicalId\":73608,\"journal\":{\"name\":\"Journal of applied measurement\",\"volume\":\"20 1\",\"pages\":\"79-99\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of applied measurement\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of applied measurement","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Examining Rater Judgements in Music Performance Assessment using Many-Facets Rasch Rating Scale Measurement Model.
The fairness of raters in music performance assessment has become an important concern in the field of music. The assessment of students' music performance depends in a fundamental way on rater judgements. The quality of rater judgements is crucial to provide fair, meaningful and informative assessments of music performance. There are many external factors that can influence the quality of rater judgements. Previous research has used different measurement models to examine the quality of rater judgements (e.g., generalizability theory). There are limitations with the previous analysis methods that are based on classical test theory and its extensions. In this study, we use modern measurement theory (Rasch measurement theory) to examine the quality of rater judgements. The many-facets Rasch rating scale model is employed to investigate the extent of rater-invariant measurement in the context of music performance assessments related to university degrees in Malaysia (159 students rated by 24 raters). We examine the rating scale structure, the severity levels of the raters, and the judged difficulty of the items. We also examine the interaction effects across musical instrument subgroups (keyboard, strings, woodwinds, brass, percussions and vocal). The results suggest that there were differences in severity levels among the raters. The results of this study also suggest that raters had different severity levels when rating different musical instrument subgroups. The implications for research, theory and practice in the assessment of music performance are included in this paper.