Dohee Kim, Unggi Lee, Sookbun Lee, Jiyeong Bae, Taekyung Ahn, Jaekwon Park, Gunho Lee, Hyeoncheol Kim
ES-KT-24: A Multimodal Knowledge Tracing Benchmark Dataset with Educational Game Playing Video and Synthetic Text Generation
arXiv - CS - Social and Information Networks, published 2024-09-16
https://doi.org/arxiv-2409.10244
Abstract
This paper introduces ES-KT-24, a novel multimodal Knowledge Tracing (KT)
dataset for intelligent tutoring systems in educational game contexts. Although
KT is crucial in adaptive learning, existing datasets often lack game-based and
multimodal elements. ES-KT-24 addresses these limitations by incorporating
educational game-playing videos, synthetically generated question text, and
detailed game logs. The dataset covers Mathematics, English, Indonesian, and
Malaysian subjects, emphasizing diversity and including non-English content.
The synthetic text component, generated using a large language model, covers
28 distinct knowledge concepts and 182 questions, and the dataset records
7,782,928 interactions from 15,032 users. Our benchmark experiments demonstrate the
dataset's utility for KT research by comparing deep learning-based KT (DKT) models
with Language Model-based Knowledge Tracing (LKT) approaches. Notably, LKT
models slightly outperformed traditional DKT models,
highlighting the potential of language model-based approaches in this field.
Furthermore, ES-KT-24 can support research on multimodal KT models and
learning analytics. By combining game-playing videos with detailed game logs,
the dataset lets researchers examine student learning patterns using data
analysis and machine-learning techniques, which may yield new insights into
the learning process and motivate further work in the field.
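
For readers new to KT, the task the benchmark evaluates can be sketched as
follows: given a learner's past responses, estimate the probability that the
next response is correct. The sketch below is only an illustration under
stated assumptions. The `Interaction` record and its field names are
hypothetical and are not the ES-KT-24 schema, and the running-accuracy
predictor is a simple stand-in baseline, not the DKT or LKT models compared in
the paper. The only numbers taken from the abstract are the user and
interaction counts.

```python
from collections import defaultdict
from dataclasses import dataclass

# Figures quoted in the abstract.
NUM_USERS = 15_032
NUM_INTERACTIONS = 7_782_928
AVG_INTERACTIONS_PER_USER = NUM_INTERACTIONS / NUM_USERS  # about 518 per user


@dataclass
class Interaction:
    """Hypothetical interaction record; the real ES-KT-24 schema may differ."""
    user_id: int
    question_id: int
    concept_id: int
    question_text: str  # stands in for the synthetically generated text
    correct: bool


def running_accuracy_predictions(history):
    """Predict each response from earlier responses on the same concept.

    Returns one probability per interaction, computed only from interactions
    that precede it. This is an assumed toy baseline, not DKT or LKT.
    """
    # concept_id -> [number correct so far, number attempted so far]
    seen = defaultdict(lambda: [0, 0])
    predictions = []
    for item in history:
        num_correct, num_attempts = seen[item.concept_id]
        # Laplace-smoothed accuracy: 0.5 when the concept has not been seen.
        predictions.append((num_correct + 1) / (num_attempts + 2))
        seen[item.concept_id][0] += int(item.correct)
        seen[item.concept_id][1] += 1
    return predictions


if __name__ == "__main__":
    history = [
        Interaction(1, 10, 1, "What is 2 + 3?", True),
        Interaction(1, 11, 1, "What is 4 + 1?", True),
        Interaction(1, 12, 1, "What is 6 + 2?", False),
        Interaction(1, 20, 2, "Which word is a noun?", True),
    ]
    for item, p in zip(history, running_accuracy_predictions(history)):
        print(item.question_id, round(p, 3), item.correct)
```

A real KT model would replace `running_accuracy_predictions` with a learned
sequence model, and an LKT-style model would additionally consume the question
text rather than only the concept identifier.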