Authors: Yu Bai; Jun Li; Jun Shen; Liang Zhao
Journal: IEEE Transactions on Learning Technologies, vol. 17, pp. 2156-2171
Publication date: 2024-09-19
Publication type: Journal Article
DOI: 10.1109/TLT.2024.3464560
URL: https://ieeexplore.ieee.org/document/10684453/
Impact factor: 2.9; JCR: Q2 (Computer Science, Interdisciplinary Applications)
Investigating the Efficacy of ChatGPT-3.5 for Tutoring in Chinese Elementary Education Settings
The potential of artificial intelligence (AI) to transform education has received considerable attention. This study explores whether large language models (LLMs) can help students study for and pass standardized exams, or whether, as many believe, that promise is mostly hype. Using primary education as an example, this research investigates whether ChatGPT-3.5 can achieve satisfactory performance on Chinese primary school exams and whether it can be used as a teaching aid or tutor. We designed an experimental framework and constructed a benchmark comprising 4800 questions collected from 48 tasks in Chinese elementary education settings. Through automatic and manual evaluations, we observed that ChatGPT-3.5’s pass rate was below the required level of accuracy for most tasks, and that the correctness of its answer interpretations was unsatisfactory. These results fell short of our initial expectations. However, comparative experiments between ChatGPT-3.5 and ChatGPT-4 indicated significant improvements in model performance, demonstrating the potential of using LLMs as a teaching aid. This article also investigates the use of a trans-prompting strategy to reduce the impact of language bias and enhance question understanding. We compare the models' performance and the improvement obtained under the trans-lingual problem decomposition prompting mechanism. Finally, we discuss the challenges associated with the appropriate application of AI-driven language models, along with future directions and limitations in the field of AI for education.
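As a rough illustration of the kind of per-task evaluation the abstract describes, the following minimal sketch computes an accuracy for each task and checks it against a pass threshold. The abstract does not state the scoring procedure or the required level of accuracy, so the 0.6 threshold, the task names, and the function names here are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_task_accuracy(results):
    """Return {task: fraction of correct answers}.

    `results` is an iterable of (task_name, is_correct) pairs,
    one pair per graded question.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for task, is_correct in results:
        totals[task] += 1
        correct[task] += int(is_correct)
    return {task: correct[task] / totals[task] for task in totals}


def tasks_passed(accuracy, threshold=0.6):
    """Mark each task as passed when its accuracy reaches the threshold.

    The default threshold of 0.6 is an assumed value for illustration only.
    """
    return {task: acc >= threshold for task, acc in accuracy.items()}


# Toy graded answers with hypothetical task names (not benchmark data).
results = [
    ("math_grade3", True),
    ("math_grade3", False),
    ("math_grade3", True),
    ("chinese_grade3", False),
    ("chinese_grade3", False),
]

accuracy = per_task_accuracy(results)
passed = tasks_passed(accuracy)
```

With a benchmark organized as tasks of graded questions, the same two functions would give one accuracy and one pass/fail flag per task, which is enough to count how many tasks fall below the required level.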
About the journal:
The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.