Language assessment in the era of generative artificial intelligence: Opportunities, challenges, and future directions

IF 5.6, Zone 1 (Literature), Q1 EDUCATION & EDUCATIONAL RESEARCH

System. Pub Date: 2025-09-20. DOI: 10.1016/j.system.2025.103846

Ping-Lin Chuang, Xun Yan

System, volume 134, Article 103846. https://www.sciencedirect.com/science/article/pii/S0346251X25002568

Abstract

Recent breakthroughs in generative artificial intelligence (GenAI) have shaken the field of language assessment in unprecedented ways. On one hand, large-scale testing companies and organizations have spearheaded AI-based assessment models, infusing GenAI elements into various test development and quality management procedures. On the other hand, in local institutional contexts, although the interest in incorporating AI-based elements in instructional programs has also been soaring, little discussion has been made regarding the affordances and affordability of AI-based assessment systems in local contexts, and the impact of AI on language assessment across contexts remains underexplored. This paper presents a systematic review of 77 articles published on GenAI use in language testing and assessment, to synthesize assessment challenges, opportunities, and solutions involving GenAI. We review issues of GenAI in the domains of reliability, validity, fairness, and practicality in large-scale, local, and classroom-based assessments. Based on the findings from the systematic review, we identify principles and best practices of language assessment in the age of GenAI. Additionally, we provide future directions for assessment research and practice related to GenAI, with a focus on impact and sustainability of GenAI in language assessment across contexts. We caution researchers and practitioners about the danger of following the trend of incorporating GenAI in an unmeasured and unselective manner.

Source journal

System Multiple-

CiteScore: 8.80

Self-citation rate: 8.30%

Articles published: 202

Review time: 64 days

Journal description: This international journal is devoted to the applications of educational technology and applied linguistics to problems of foreign language teaching and learning. Attention is paid to all languages and to problems associated with the study and teaching of English as a second or foreign language. The journal serves as a vehicle of expression for colleagues in developing countries. System prefers its contributors to provide articles which have a sound theoretical base with a visible practical application which can be generalized. The review section may take up works of a more theoretical nature to broaden the background.