A survey on deep learning-based automated essay scoring and feedback generation

IF 10.7 | CAS Zone 2 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)

Artificial Intelligence Review Pub Date : 2024-12-20 DOI:10.1007/s10462-024-11017-5

Haile Misgna, Byung-Won On, Ingyu Lee, Gyu Sang Choi

Volume 58, issue 2 | Journal Article | https://link.springer.com/article/10.1007/s10462-024-11017-5

Citations: 0

Abstract

Deep learning-based automated essay scoring (AES) models exhibit a remarkable ability to identify complex patterns within essays and then generate accurate score predictions in an end-to-end training fashion. However, these models face a critical limitation in explaining the specific patterns and features utilized for scoring, which are essential for interpreting the scores and offering constructive feedback to essay authors. Numerous studies have focused on essay scoring, with the aim of modeling prompt-specific, domain-adaptable, or trait-specific AES. While existing surveys on AES cover topics ranging from representation to scoring models, they primarily emphasize scoring models. This study addresses a crucial gap by encompassing research on feedback generation for essay assessment tasks. By delving into essay scoring and feedback generation, we synthesize the existing literature to provide readers with a comprehensive understanding of ongoing research in both deep learning-based essay scoring and automated feedback generation. We categorized the existing essay scoring studies into prompt-specific and cross-prompt AES models, noting that prompt-specific AES is an extensively researched category. However, we have come across only a few studies concerning automated feedback generation, likely because of the limited availability of suitable datasets for researching such tasks. Moreover, this survey provides insights into approaches for essay representation, prevalent datasets, evaluation metrics, and challenges in automated essay scoring tasks. By shedding light on these aspects, our goal is to delineate the current landscape, identify key research directions, and pave the way for further advancements in automated essay assessment.

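Source journal

Artificial Intelligence Review (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months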

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.