Chinmay Kulkarni, R. Socher, Michael S. Bernstein, Scott R. Klemmer
"Scaling short-answer grading by combining peer assessment with algorithmic scoring"
Proceedings of the First ACM Conference on Learning @ Scale, published 2014-03-04.
DOI: 10.1145/2556325.2566238 (https://doi.org/10.1145/2556325.2566238)
Citations: 96
Abstract
Peer assessment helps students reflect and exposes them to different ideas. It scales assessment and allows large online classes to use open-ended assignments. However, it requires students to spend significant time grading. How can we lower this grading burden while maintaining quality? This paper integrates peer and machine grading to preserve the robustness of peer assessment and lower grading burden. In the identify-verify pattern, a grading algorithm first predicts a student grade and estimates confidence, which is used to estimate the number of peer raters required. Peers then identify key features of the answer using a rubric. Finally, other peers verify whether these feature labels were accurately applied. This pattern adjusts the number of peers that evaluate an answer based on algorithmic confidence and peer agreement. We evaluated this pattern with 1370 students in a large, online design class. With only 54% of the student grading time, the identify-verify pattern yields 80-90% of the accuracy obtained by taking the median of three peer scores, and provides more detailed feedback. A second experiment found that verification dramatically improves accuracy with more raters, with a 20% gain over the peer-median with four raters. However, verification also leads to lower initial trust in the grading system. The identify-verify pattern provides an example of how peer work and machine learning can combine to improve the learning experience.
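The identify-verify workflow described above can be sketched in code: an algorithmic grader supplies a predicted score and a confidence, the confidence sets the peer-rater budget, peers apply rubric feature labels, and verifiers filter out mislabeled features before the scores are combined. This is a minimal illustrative sketch, not the paper's implementation; the function names, confidence thresholds, and the median-based combination rule are all assumptions made for the example.

```python
# Hedged sketch of the identify-verify pattern from the abstract.
# All thresholds and the combination rule are illustrative assumptions.
from statistics import median

def raters_needed(confidence: float) -> int:
    """Map algorithmic confidence to a peer-rater budget (assumed thresholds)."""
    if confidence >= 0.9:
        return 1   # high confidence: a single peer identifies rubric features
    if confidence >= 0.6:
        return 2
    return 4       # low confidence: recruit more peer raters

def identify_verify(machine_grade: float, confidence: float,
                    peer_grades: list[float],
                    verified: list[bool]) -> float:
    """Combine machine and peer grades, keeping only verified peer labels.

    peer_grades[i] is the score implied by peer i's rubric feature labels;
    verified[i] is True when other peers confirmed those labels were
    accurately applied.
    """
    n = raters_needed(confidence)
    kept = [g for g, ok in zip(peer_grades[:n], verified[:n]) if ok]
    if not kept:
        return machine_grade  # no verified peer labels: fall back to the algorithm
    return median(kept + [machine_grade])
```

For example, with high confidence (0.95) only one peer grade is consulted, so a machine grade of 7.0 and a verified peer grade of 8.0 combine to 7.5; with low confidence the budget grows to four raters, echoing the paper's finding that verification pays off most as raters are added.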