Does ChatGPT enhance student learning? A systematic review and meta-analysis of experimental studies

IF 8.9 | Zone 1 (Education) | Q1 (Computer Science, Interdisciplinary Applications)
Ruiqi Deng, Maoli Jiang, Xinlu Yu, Yuyan Lu, Shasha Liu
Journal: Computers & Education, Volume 227, Article 105224
Published: 2024-12-12
DOI: 10.1016/j.compedu.2024.105224
URL: https://www.sciencedirect.com/science/article/pii/S0360131524002380

Abstract

Chat Generative Pre-Trained Transformer (ChatGPT) has generated excitement and concern in education. While cross-sectional studies have highlighted correlations between ChatGPT use and learning performance, they fall short of establishing causality. This review examines experimental studies on ChatGPT's impact on student learning to address this gap. A comprehensive search across five databases identified 69 articles published between 2022 and 2024 for analysis. The findings reveal that ChatGPT interventions are predominantly implemented at the university level, cover various subject areas with a focus on language education, are integrated into classroom environments as part of regular educational practices, and primarily involve direct student use of ChatGPT. Overall, ChatGPT improves academic performance, affective-motivational states, and higher-order thinking propensities; it reduces mental effort and has no significant effect on self-efficacy. However, methodological limitations, such as the lack of power analysis and concerns regarding post-intervention assessments, warrant cautious interpretation of results. This review presents four propositions from the findings: (1) distinguish between the quality of ChatGPT outputs and the positive effects of interventions on academic performance by shifting from well-defined problems in post-intervention assessments to more complex, project-based assessments that require skill demonstration, adopting proctored assessments, or incorporating metrics such as originality alongside quality; (2) evaluate long-term impacts to determine whether the positive effects on affective-motivational states are sustained or merely owing to a novelty effect; (3) prioritise objective measures to complement subjective assessments of higher-order thinking; and (4) use power analysis to determine adequate sample sizes to avoid Type II errors and provide reliable effect size estimates. This review provides valuable insights for researchers, instructors, and policymakers evaluating the effectiveness of generative AI integration in educational practice.
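Proposition (4) can be illustrated with a rough a priori power calculation. The sketch below is not taken from the review: it assumes a two-group design compared on a standardised mean difference (Cohen's d), a two-sided test, and the common normal approximation n = 2((z_{1-alpha/2} + z_{power}) / d)^2 per group. The function name and default values are hypothetical, and an exact t-based calculation would typically give a slightly larger n.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate participants needed per group (hypothetical helper).

    Assumes two equal-sized independent groups, a two-sided test, and
    the normal approximation to the two-sample t-test.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to whole participants


if __name__ == "__main__":
    # Conventional small, medium, and large benchmarks for Cohen's d.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Under these assumptions, smaller expected effects require much larger samples, which is why skipping the calculation risks the Type II errors the abstract warns about.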
Source journal: Computers & Education
Category: Engineering and Technology, Computer Science: Interdisciplinary Applications
CiteScore: 27.10
Self-citation rate: 5.80%
Publication volume: 204
Review time: 42 days
About the journal: Computers & Education seeks to advance understanding of how digital technology can improve education by publishing high-quality research that expands both theory and practice. The journal welcomes research papers exploring the pedagogical applications of digital technology, with a focus broad enough to appeal to the wider education community.