研究 ChatGPT-3.5 在中国小学教育环境中的辅导效果

IF 2.9 3区 教育学 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Yu Bai;Jun Li;Jun Shen;Liang Zhao
{"title":"研究 ChatGPT-3.5 在中国小学教育环境中的辅导效果","authors":"Yu Bai;Jun Li;Jun Shen;Liang Zhao","doi":"10.1109/TLT.2024.3464560","DOIUrl":null,"url":null,"abstract":"The potential of artificial intelligence (AI) in transforming education has received considerable attention. This study aims to explore the potential of large language models (LLMs) in assisting students with studying and passing standardized exams, while many people think it is a hype situation. Using primary education as an example, this research investigates whether ChatGPT-3.5 can achieve satisfactory performance on the Chinese Primary School Exams and whether it can be used as a teaching aid or tutor. We designed an experimental framework and constructed a benchmark that comprises 4800 questions collected from 48 tasks in Chinese elementary education settings. Through automatic and manual evaluations, we observed that ChatGPT-3.5’s pass rate was below the required level of accuracy for most tasks, and the correctness of ChatGPT-3.5’s answer interpretation was unsatisfactory. These results revealed a discrepancy between the findings and our initial expectations. However, the comparative experiments between ChatGPT-3.5 and ChatGPT-4 indicated significant improvements in model performance, demonstrating the potential of using LLMs as a teaching aid. This article also investigates the use of the trans-prompting strategy to reduce the impact of language bias and enhance question understanding. We present a comparison of the models' performance and the improvement under the trans-lingual problem decomposition prompting mechanism. Finally, we discuss the challenges associated with the appropriate application of AI-driven language models, along with future directions and limitations in the field of AI for education.","PeriodicalId":49191,"journal":{"name":"IEEE Transactions on Learning Technologies","volume":"17 ","pages":"2156-2171"},"PeriodicalIF":2.9000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Investigating the Efficacy of ChatGPT-3.5 for Tutoring in Chinese Elementary Education Settings\",\"authors\":\"Yu Bai;Jun Li;Jun Shen;Liang Zhao\",\"doi\":\"10.1109/TLT.2024.3464560\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The potential of artificial intelligence (AI) in transforming education has received considerable attention. This study aims to explore the potential of large language models (LLMs) in assisting students with studying and passing standardized exams, while many people think it is a hype situation. Using primary education as an example, this research investigates whether ChatGPT-3.5 can achieve satisfactory performance on the Chinese Primary School Exams and whether it can be used as a teaching aid or tutor. We designed an experimental framework and constructed a benchmark that comprises 4800 questions collected from 48 tasks in Chinese elementary education settings. Through automatic and manual evaluations, we observed that ChatGPT-3.5’s pass rate was below the required level of accuracy for most tasks, and the correctness of ChatGPT-3.5’s answer interpretation was unsatisfactory. These results revealed a discrepancy between the findings and our initial expectations. However, the comparative experiments between ChatGPT-3.5 and ChatGPT-4 indicated significant improvements in model performance, demonstrating the potential of using LLMs as a teaching aid. This article also investigates the use of the trans-prompting strategy to reduce the impact of language bias and enhance question understanding. We present a comparison of the models' performance and the improvement under the trans-lingual problem decomposition prompting mechanism. Finally, we discuss the challenges associated with the appropriate application of AI-driven language models, along with future directions and limitations in the field of AI for education.\",\"PeriodicalId\":49191,\"journal\":{\"name\":\"IEEE Transactions on Learning Technologies\",\"volume\":\"17 \",\"pages\":\"2156-2171\"},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2024-09-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Learning Technologies\",\"FirstCategoryId\":\"95\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10684453/\",\"RegionNum\":3,\"RegionCategory\":\"教育学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Learning Technologies","FirstCategoryId":"95","ListUrlMain":"https://ieeexplore.ieee.org/document/10684453/","RegionNum":3,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

摘要

人工智能(AI)在改变教育方面的潜力已受到广泛关注。本研究旨在探索大型语言模型(LLM)在帮助学生学习和通过标准化考试方面的潜力,而很多人认为这是一种炒作情况。本研究以小学教育为例,探讨 ChatGPT-3.5 是否能在中国小学考试中取得令人满意的成绩,以及是否可用作教学辅助工具或辅导工具。我们设计了一个实验框架,并构建了一个基准,其中包括从中国小学教育环境中的 48 个任务中收集的 4800 道题。通过自动和人工评估,我们发现 ChatGPT-3.5 的通过率在大多数任务中都低于要求的准确率,而且 ChatGPT-3.5 的答案解释正确率也不尽如人意。这些结果表明,实验结果与我们最初的预期存在差异。不过,ChatGPT-3.5 和 ChatGPT-4 的对比实验表明,模型性能有了显著提高,这证明了使用 LLM 作为教学辅助工具的潜力。本文还研究了如何使用反向提示策略来减少语言偏差的影响并增强对问题的理解。我们比较了模型的性能以及在跨语言问题分解提示机制下的改进情况。最后,我们讨论了适当应用人工智能驱动的语言模型所面临的挑战,以及人工智能教育领域的未来发展方向和局限性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Investigating the Efficacy of ChatGPT-3.5 for Tutoring in Chinese Elementary Education Settings
The potential of artificial intelligence (AI) in transforming education has received considerable attention. This study aims to explore the potential of large language models (LLMs) in assisting students with studying and passing standardized exams, while many people think it is a hype situation. Using primary education as an example, this research investigates whether ChatGPT-3.5 can achieve satisfactory performance on the Chinese Primary School Exams and whether it can be used as a teaching aid or tutor. We designed an experimental framework and constructed a benchmark that comprises 4800 questions collected from 48 tasks in Chinese elementary education settings. Through automatic and manual evaluations, we observed that ChatGPT-3.5’s pass rate was below the required level of accuracy for most tasks, and the correctness of ChatGPT-3.5’s answer interpretation was unsatisfactory. These results revealed a discrepancy between the findings and our initial expectations. However, the comparative experiments between ChatGPT-3.5 and ChatGPT-4 indicated significant improvements in model performance, demonstrating the potential of using LLMs as a teaching aid. This article also investigates the use of the trans-prompting strategy to reduce the impact of language bias and enhance question understanding. We present a comparison of the models' performance and the improvement under the trans-lingual problem decomposition prompting mechanism. Finally, we discuss the challenges associated with the appropriate application of AI-driven language models, along with future directions and limitations in the field of AI for education.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Transactions on Learning Technologies
IEEE Transactions on Learning Technologies COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS-
CiteScore
7.50
自引率
5.40%
发文量
82
审稿时长
>12 weeks
期刊介绍: The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信