Authors: Liuying Gong; Jingyuan Chen; Fei Wu
Journal: IEEE Transactions on Learning Technologies, vol. 18, pp. 530-541
Publication date: 2025-04-24
Publication type: Journal Article
DOI: 10.1109/TLT.2025.3564177
URL: https://ieeexplore.ieee.org/document/10976353/
Impact factor: 2.9; JCR: Q2 (Computer Science, Interdisciplinary Applications)
Is ChatGPT a Competent Teacher? Systematic Evaluation of Large Language Models on the Competency Model
The capabilities of large language models (LLMs) in language comprehension, conversational interaction, and content generation have led to their widespread adoption across various educational stages and contexts. Given the fundamental role of education, concerns are rising about whether LLMs can serve as competent teachers. To address the challenge of comprehensively evaluating the competencies of LLMs as teachers, a systematic quantitative evaluation based on the competency model has emerged as a valuable approach. Our study, grounded in the teacher competency model and drawing from 14 existing scales, constructed an evaluation framework called TeacherComp. Based on TeacherComp, we evaluated six LLMs from OpenAI across four dimensions: knowledge, skills, values, and traits. Through comparisons between LLMs’ responses and human norms, we found that: 1) with each successive update, LLMs have shown overall improvements in knowledge, while their skills dimension scores have increasingly aligned with human norms; 2) there are both commonalities and differences in the performance of various LLMs regarding values and traits. For instance, while they all tend to exhibit more negative traits than humans, their morals can vary; and 3) LLMs with reduced security, constructed using jailbreak techniques, exhibit values and traits more closely aligned with human norms. Building on these findings, we provided interpretations and suggestions for the application of LLMs in various educational contexts. Overall, this study helps teachers and students use LLMs in appropriate contexts and provides developers with guidance for future iterations, thereby advancing the role of LLMs in empowering education.
About the journal:
The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.