Using ChatGPT to annotate a dataset: A case study in intelligent tutoring systems

Aleksandar Vujinović, Nikola Luburić, Jelena Slivka, Aleksandar Kovačević
Machine Learning with Applications, Volume 16, Article 100557. Published 2024-05-09.
DOI: 10.1016/j.mlwa.2024.100557
URL: https://www.sciencedirect.com/science/article/pii/S2666827024000331

Abstract

Large language models such as ChatGPT can learn in context from examples, a capability known as in-context learning (ICL). Studies have shown that, owing to ICL, ChatGPT achieves impressive performance on various natural language processing tasks. However, to the best of our knowledge, this is the first study to assess ChatGPT's effectiveness in annotating a dataset for training instructor models in intelligent tutoring systems (ITSs). The task of an ITS instructor model is to automatically provide effective tutoring instruction given a student's state, mimicking human instructors. These models are typically implemented as hardcoded rules, which requires expertise and limits their ability to generalize and personalize instructions. These problems could be mitigated by using machine learning (ML). However, developing ML models requires a large dataset of student states annotated with the corresponding tutoring instructions. Having human experts annotate such a dataset is expensive and time-consuming, and it requires pedagogical expertise. Thus, this study explores ChatGPT's potential to act as an expert pedagogical annotator. Using prompt engineering, we created a list of instructions a tutor could recommend to a student. We manually filtered this list and instructed ChatGPT to select the appropriate instruction from the list for a given student state. We then manually analyzed those ChatGPT responses that could be considered incorrectly annotated. Our results indicate that using ChatGPT as an annotator is an effective alternative to human experts. The contributions of our work are (1) a novel dataset annotation methodology for ITSs, (2) a publicly available dataset of student states annotated with tutoring instructions, and (3) a list of possible tutoring instructions.
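
The selection step described in the abstract (give the model a fixed list of instructions and a student state, then ask it to pick one) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the instruction texts, the student-state fields, the prompt wording, and the names `build_prompt`, `parse_choice`, `annotate`, and `ask_model` are all hypothetical. The model call is passed in as a plain function so that no particular API is assumed.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical placeholder instructions; the paper's actual list is not reproduced here.
INSTRUCTIONS = [
    "Review the prerequisite material",
    "Try an easier practice problem",
    "Continue to the next unit",
]


def build_prompt(student_state: dict, instructions: Sequence[str]) -> str:
    """Format a selection prompt: the student's state plus numbered options."""
    options = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, start=1))
    state = "\n".join(f"- {key}: {value}" for key, value in student_state.items())
    return (
        "You are an expert tutor. Given the student state below, "
        "choose the single most appropriate instruction.\n"
        f"Student state:\n{state}\n"
        f"Instructions:\n{options}\n"
        "Answer with the number of the chosen instruction only."
    )


def parse_choice(reply: str, instructions: Sequence[str]) -> Optional[str]:
    """Map the model's reply to an instruction, or None if it is not a valid option."""
    try:
        index = int(reply.strip().rstrip("."))
    except ValueError:
        return None
    if 1 <= index <= len(instructions):
        return instructions[index - 1]
    return None


def annotate(
    student_state: dict,
    instructions: Sequence[str],
    ask_model: Callable[[str], str],  # assumed wrapper around whatever chat model is used
) -> Tuple[Optional[str], bool]:
    """Return (label, needs_review); replies outside the list are flagged for manual analysis."""
    reply = ask_model(build_prompt(student_state, instructions))
    label = parse_choice(reply, instructions)
    return label, label is None


if __name__ == "__main__":
    # A stub stands in for the model so the sketch runs offline.
    state = {"attempts": 3, "last_answer_correct": False}
    print(annotate(state, INSTRUCTIONS, lambda prompt: "2"))
```

Restricting the answer to an option number keeps every label inside the manually filtered list, and any reply that cannot be mapped to an option is set aside for manual review instead of being stored as an annotation.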
