Challenges and opportunities of automated essay scoring for low-proficient L2 English writers

IF 5.5 | Zone 1 (Literature) | JCR Q1, Education & Educational Research

Assessing Writing Pub Date : 2025-09-20 DOI:10.1016/j.asw.2025.100982

Vanessa De Wilde, Orphée De Clercq

Assessing Writing, volume 66, Article 100982. Full text: https://www.sciencedirect.com/science/article/pii/S1075293525000698

Citations: 0

Abstract

Assessing students’ writing can be a challenging activity. To make writing assessment more feasible, researchers have investigated the possibilities of automated essay scoring (AES). Most studies investigating AES have focused on L1 writing or on intermediate to advanced L2 writing. In this study we explored the possibilities of using AES with low-proficiency L2 English writers. Our dataset comprised writing samples from 3166 young L2 English learners who were at the very start of L2 English instruction. All tasks were scored by human raters.

For automated scoring we experimented with two machine learning methods. The first was a feature-based approach, for which the dataset was linguistically preprocessed using natural language processing tools. The second employed deep learning by fine-tuning various large language models. Because we were particularly interested in the influence of spelling errors, we also created a corrected, spell-checked version of our dataset.
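The abstract does not name the features or tools used in the feature-based approach. As a rough illustration only, the sketch below computes three simple surface features (token count, type-token ratio, mean word length) from a learner text. The function name `surface_features`, the choice of features, and the sample sentence are all assumptions made for this example and are not taken from the study.

```python
import re


def surface_features(text):
    """Return a few simple surface features for one learner text.

    Illustrative only: these features are assumptions, not the
    feature set used in the study.
    """
    # Lowercase the text and keep alphabetic tokens (apostrophes allowed)
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    return {
        "n_tokens": n_tokens,
        # Lexical diversity: distinct words divided by total words
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        # Average number of characters per word
        "mean_word_length": (
            sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
        ),
    }


# Hypothetical learner sentence containing a spelling error ("hav")
sample = "I hav a dog. The dog is big."
print(surface_features(sample))
```

Feature vectors of this kind could then be passed to any standard regression or classification model. Computing them on both the original and the spell-checked text is one way to see how much spelling errors shift the features.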

Models trained on the uncorrected samples yielded the best results. The deep learning approach in particular performed satisfactorily, with a quadratic weighted kappa above .70. The model fine-tuned from an underlying Dutch large language model was superior, which might be linked to the low L2 English proficiency of the young L1 Dutch writers in our sample.
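Quadratic weighted kappa measures agreement between two sets of ordinal scores, penalising each disagreement by the squared distance between the scores and correcting for chance agreement. The sketch below is a minimal pure-Python version of the standard formula. The function name, the argument names, and the score range in the usage line are assumptions for illustration; the abstract does not state the scoring scale or the implementation the authors used.

```python
from collections import Counter


def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Cohen's kappa with quadratic weights for two lists of integer scores."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")

    n = len(human)
    n_cat = max_score - min_score + 1
    observed = Counter(zip(human, machine))  # joint counts of score pairs
    hist_h = Counter(human)                  # marginal counts, human rater
    hist_m = Counter(machine)                # marginal counts, model

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic weight: 0 on the diagonal, 1 at maximum distance
            w = (i - j) ** 2 / (n_cat - 1) ** 2
            num += w * observed[(i, j)] / n
            den += w * hist_h[i] * hist_m[j] / (n * n)

    if den == 0.0:
        # Both raters gave the same single score to every sample
        return 1.0
    return 1.0 - num / den


# Hypothetical scores on an assumed 1-5 scale
print(quadratic_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 4], 1, 5))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.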


Source journal

Assessing Writing Multiple-

CiteScore

6.00

Self-citation rate

17.90%

Articles published

About the journal: Assessing Writing is a refereed international journal providing a forum for ideas, research and practice on the assessment of written language. Assessing Writing publishes articles, book reviews, conference reports, and academic exchanges concerning writing assessments of all kinds, including traditional (direct and standardised forms of) testing of writing, alternative performance assessments (such as portfolios), workplace sampling and classroom assessment. The journal focuses on all stages of the writing assessment process, including needs evaluation, assessment creation, implementation, and validation, and test development.