Deep Learning Based Knowledge Tracing: A Review, a Tool and Empirical Studies
Authors: Zitao Liu; Teng Guo; Qianru Liang; Mingliang Hou; Bojun Zhan; Jiliang Tang; Weiqi Luo; Jian Weng
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 8, pp. 4512-4536
Published: 2025-03-19 (Journal Article)
DOI: 10.1109/TKDE.2025.3552759
URL: https://ieeexplore.ieee.org/document/10933562/
Impact factor: 10.4 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Knowledge tracing (KT) uses historical data from students' learning interactions to model their mastery of knowledge over time, with the aim of predicting their performance in future interactions. Recently, significant advances have been achieved by applying various deep learning methods to the KT problem. However, a considerable proportion of deep learning-based knowledge tracing (DLKT) approaches are strikingly similar in their methodologies and model designs, and even their outcomes show minimal divergence. In addition, the evaluation procedures employed in current DLKT studies are not standardized, resulting in substantial inconsistencies in the reported area under the curve (AUC) results, even when the same model is evaluated on identical datasets. To address these two problems, this paper proposes a generalized DLKT framework and represents existing DLKT models with five components: multimodal data encoder, student knowledge memory, auxiliary knowledge base, learning outcome objective, and computational efficiency and scalability. Furthermore, we develop and open-source a standardized DLKT benchmark platform named pyKT, which consists of a standardized set of integrated data preprocessing procedures on 9 popular datasets across different domains and 21 frequently compared DLKT model implementations. With pyKT, we conduct empirical and reproducible research to assess the performance of prevalent DLKT algorithms in an unbiased and clear setting over multiple data sources. Finally, we discuss the applications of KT techniques in the educational sector and their future development directions.
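To make the task and the AUC metric in the abstract concrete, here is a minimal sketch in plain Python. It is not the pyKT API and not any model from the paper: the toy response sequences, the running-mean baseline, and the function names `running_mean_predictions` and `auc` are all assumptions introduced for illustration. Each response is predicted only from that student's earlier responses, and the predictions are then scored with a pairwise AUC.

```python
from typing import List, Tuple

# Each student's history is a list of 0/1 correctness flags in time order.
# These toy sequences are invented for illustration only.
Sequences = List[List[int]]


def running_mean_predictions(
    seqs: Sequences, prior: float = 0.5
) -> Tuple[List[float], List[int]]:
    """Predict each response from the student's mean past correctness.

    A hypothetical baseline, not a deep KT model: it only shows the
    "predict the next interaction from the history" setup.
    """
    scores: List[float] = []
    labels: List[int] = []
    for seq in seqs:
        correct_so_far = 0
        for t, outcome in enumerate(seq):
            # Only interactions before step t are used, so no label leaks in.
            pred = prior if t == 0 else correct_so_far / t
            scores.append(pred)
            labels.append(outcome)
            correct_so_far += outcome
    return scores, labels


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


toy = [[1, 1, 0, 1], [0, 0, 1, 0], [1, 0, 1, 1]]
scores, labels = running_mean_predictions(toy)
print(f"AUC over all steps: {auc(scores, labels):.3f}")
```

The quadratic pairwise AUC is chosen here for readability; on real datasets a rank-based implementation from an established library would be the more practical choice.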
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.