Survey and Analysis for the Challenges in Computer Science to the Automation of Grading Systems

IF 28 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys | Pub Date: 2025-07-14 | DOI: 10.1145/3748521

Joan Lu, Bhavya Krishna Balasubramanian, Mike Joy, Qiang Xu

Citations: 0

Abstract

Assessment is essential to any educational system. Automatic grading reduces the time and effort tutors spend assessing students' written answers. To understand the computational methods recently used for automatic grading, a review was conducted. A keyword search initially identified 4084 papers; after filtering, 57 remained. The review found that statistical models are normally used in Automatic Short Answer Grading (ASAG), that vector-based similarity measures are the most popular among projects, that pilot datasets are mostly used, and that standard datasets for evaluation are missing. Evidence shows that machine learning and deep learning are the most widely adopted methods and that generative AI, e.g., LLMs and ChatGPT, is also being taken up, which indicates that integrating AI into education is an inevitable trend. Most investigations also prefer to combine multiple approaches to improve computational quality, dataset analysis, and evaluation results. The identified research gaps offer a useful reference for users and researchers working on formative and summative assessment. We conclude that the presented outcomes, analysis, and discussion are informative to academics and pedagogical practitioners interested in further developing or using ASAG systems. Although research into ASAG is still rudimentary, it is a promising area with impact on both academia and the commercial education market.
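The abstract reports that vector-based similarity measures are the most popular approach among the surveyed ASAG projects. As a rough illustration of that general idea, and not the method of any particular surveyed system, the sketch below compares a student answer with a reference answer using cosine similarity over term-frequency vectors. The function names (`tokenize`, `cosine_similarity`, `grade`) and the linear scaling of similarity to marks are assumptions made for this example only.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty answer shares nothing with the reference.
    return dot / (norm_a * norm_b)


def grade(student_answer: str, reference_answer: str, max_marks: float = 5.0) -> float:
    """Hypothetical scoring rule: scale similarity linearly to a mark."""
    return round(max_marks * cosine_similarity(student_answer, reference_answer), 2)


if __name__ == "__main__":
    reference = "Photosynthesis converts light energy into chemical energy."
    answer = "Plants turn light energy into chemical energy."
    print(grade(answer, reference))
```

A plain bag-of-words vector ignores synonyms and word order, which is one reason the surveyed work often combines several approaches rather than relying on a single similarity measure.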

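Source journal

ACM Computing Surveys (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 33.20
Self-citation rate: 0.60%
Articles published: 372
Review time: 12 months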

Journal description: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on areas of computing research and practice. The journal aims to provide comprehensive, easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. CSUR has a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods. ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.