Nivja H. de Jong, Stephan Raaijmakers, Dineke Tigelaar
Developing high-quality, practical, and ethical automated L2 speaking assessments. System, 134, Article 103796. Published 2025-07-29. DOI: 10.1016/j.system.2025.103796
Abstract
To foster second language (L2) learners' speaking abilities, regular speaking practice is necessary, including regular assessments and individual feedback to learners. However, in current classroom settings, practicing and assessing speaking are often neglected. For teachers, it is particularly hard to provide individualized feedback on speaking, because speech is a loud and transient phenomenon. Additionally, recording speech and providing individual assessments and feedback based on those recordings is highly time-consuming. Therefore, to alleviate the workload of individual assessment, teachers and learners would benefit from automated speaking assessments. In this paper, we first describe the requirements for high-quality, practical, and ethical tools for automated scoring of and feedback on L2 speaking performances. Subsequently, we describe and evaluate existing tools for automated L2 speaking assessment. We conclude that none of the described tools meet all the identified requirements. Combining insights from the AI-based assessment framework (Fang et al., 2023) with an educational design approach, we offer recommendations intended to guide computational linguists, together with researchers and practitioners in education and assessment, on how to successfully integrate computational research with educational design. The goal of such future research is to develop generative AI (GenAI)-based systems that are technically sound, ethically responsible, and likely to be adopted in educational practice.
About the journal:
This international journal is devoted to the application of educational technology and applied linguistics to problems of foreign language teaching and learning. Attention is paid to all languages and to problems associated with the study and teaching of English as a second or foreign language. The journal serves as a vehicle of expression for colleagues in developing countries. System prefers contributions that have a sound theoretical basis with a visible, generalizable practical application. The review section may take up works of a more theoretical nature to broaden the background.