Evaluating an AI speaking assessment tool: Score accuracy, perceived validity, and oral peer feedback as feedback enhancement

Impact Factor 3.1 · CAS Tier 1 (Literature) · JCR Q1, Education & Educational Research
Xu Jared Liu, Jingwen Wang, Bin Zou
DOI: 10.1016/j.jeap.2025.101505
Journal: Journal of English for Academic Purposes, Volume 75, Article 101505
Published online: 2025-04-07
URL: https://www.sciencedirect.com/science/article/pii/S1475158525000360
Citations: 0

Abstract

Artificial Intelligence (AI) has significantly transformed language learning approaches and outcomes. However, research on AI-assisted English for Academic Purposes (EAP) speaking classrooms remains sparse. This study evaluates "EAP Talk", an AI-assisted speaking assessment tool, examining its effectiveness in two contexts: controlled tasks (Reading Aloud) that elicit non-spontaneous speech, and uncontrolled tasks (Presentation) that generate spontaneous speech. The research assessed the accuracy and validity of EAP Talk scores by analysing 20 Reading Aloud and 20 Presentation recordings randomly selected from a pool of 64 undergraduate students. These recordings were graded by five experienced EAP teachers using Adaptive Comparative Judgment (ACJ) – a comparative scoring method – and the traditional rubric rating approach. Acknowledging the limitation of EAP Talk in providing scores without detailed feedback, the study further investigated its perceived validity and examined oral peer feedback as a complementary enhancement strategy. Semi-structured interviews with four students were conducted to investigate their perceptions of the AI-assisted assessment process, focusing on the benefits of EAP Talk in enhancing learning, its limitations, and the effectiveness of oral peer feedback. Scoring concordance analysis shows that EAP Talk performs well in the controlled task but less so in the uncontrolled one. Content analysis of the interview data reveals that EAP Talk facilitates student confidence and positively shapes learning styles, while oral peer feedback markedly improves speaking skills through effective human-computer collaboration. The study calls for more precise AI assessments in uncontrolled tasks and proposes pedagogical strategies to better integrate AI into EAP speaking contexts.
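To make the two methods named in the abstract concrete: Adaptive Comparative Judgment (ACJ) derives a rank order from many pairwise "which recording is better?" judgments rather than from rubric marks, and a concordance analysis then checks how well an AI's scores agree with that ranking. The sketch below is illustrative only, with invented recording IDs, judgment counts, and AI scores (the paper's actual data and ACJ implementation are not public); it fits Bradley-Terry strengths with a standard MM update and computes Spearman's rank correlation against the hypothetical AI scores.

```python
# Hypothetical pairwise judgments: judgments[(a, b)] = times a was judged
# better than b by the teacher panel. All data here is invented.
judgments = {
    ("s1", "s2"): 3, ("s2", "s1"): 1,
    ("s1", "s3"): 4, ("s3", "s1"): 0,
    ("s1", "s4"): 4, ("s4", "s1"): 0,
    ("s2", "s3"): 3, ("s3", "s2"): 1,
    ("s2", "s4"): 2, ("s4", "s2"): 2,
    ("s3", "s4"): 1, ("s4", "s3"): 3,
}
items = sorted({i for pair in judgments for i in pair})

def bradley_terry(judgments, items, iters=200):
    """Estimate Bradley-Terry strengths via minorization-maximization:
    strength_i <- wins_i / sum_j games_ij / (strength_i + strength_j)."""
    strength = {i: 1.0 for i in items}
    for _ in range(iters):
        new = {}
        for i in items:
            wins = sum(judgments.get((i, j), 0) for j in items if j != i)
            denom = sum(
                (judgments.get((i, j), 0) + judgments.get((j, i), 0))
                / (strength[i] + strength[j])
                for j in items if j != i
            )
            new[i] = wins / denom if denom else strength[i]
        total = sum(new.values())  # renormalize to keep the scale stable
        strength = {i: v * len(items) / total for i, v in new.items()}
    return strength

def spearman(xs, ys):
    """Spearman's rho for the concordance check (assumes no tied values)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda k: v[k])
        r = [0] * len(v)
        for rank, k in enumerate(order):
            r[k] = rank + 1
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

acj = bradley_terry(judgments, items)           # teacher-panel ACJ scale
ai_scores = {"s1": 88, "s2": 75, "s3": 62, "s4": 70}  # hypothetical AI marks
rho = spearman([acj[i] for i in items], [ai_scores[i] for i in items])
print(sorted(items, key=acj.get, reverse=True), round(rho, 2))
```

A high rho would correspond to the abstract's finding for the controlled Reading Aloud task (strong AI-teacher concordance), while a low rho would mirror the weaker agreement reported for the spontaneous Presentation task.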
Source journal metrics
CiteScore: 6.60
Self-citation rate: 13.30%
Articles per year: 81
Review time: 57 days
About the journal: The Journal of English for Academic Purposes provides a forum for the dissemination of information and views which enables practitioners of and researchers in EAP to keep current with developments in their field and to contribute to its continued updating. JEAP publishes articles, book reviews, conference reports, and academic exchanges in the linguistic, sociolinguistic and psycholinguistic description of English as it occurs in the contexts of academic study and scholarly exchange itself.