Doubly constrained offline reinforcement learning for learning path recommendation
Yue Yun, Huan Dai, Rui An, Yupei Zhang, Xuequn Shang
Knowledge-Based Systems, Volume 284, Article 111242. Published 2023-11-29. DOI: 10.1016/j.knosys.2023.111242
https://www.sciencedirect.com/science/article/pii/S0950705123009917
Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence)
Abstract
Learning path recommendation applies interactive recommendation systems to education, aiming to optimize learning outcomes while minimizing the workload of learners, teachers, and curriculum designers. Reinforcement learning (RL) has proven effective at capturing and modeling the complex interactions among course activities, learner behaviors, and educational outcomes. Combining the two therefore offers considerable potential for personalized education. However, traditional RL algorithms require extensive interaction with the environment during the training phase. Using unverified recommendation logic in interactions with actual students can give rise to unmanageable problems and hinder effective performance in an educational setting. This is because extrapolation introduces substantial evaluation errors, so that recommendations deviate significantly from actual educational requirements. To address this limitation, we propose a novel offline reinforcement learning method called Doubly Constrained deep Q-learning Network (DCQN). The method uses two generative models to fit existing historical student interaction data, which in turn constrain the original policy network to generate new actions based on past interactions, avoiding overestimated actions and reducing extrapolation errors. Empirical results demonstrate that this approach outperforms existing techniques on both D4RL (Datasets for Deep Data-Driven Reinforcement Learning) and real educational datasets.
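The general idea of constraining action selection to what the logged data supports can be illustrated with a small sketch. This is not the DCQN algorithm: the abstract does not specify its two generative models, so a single empirical frequency table stands in for them here. The function names, the `threshold` parameter, and the toy activities (`review`, `quiz`, `exam`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_behavior_model(transitions):
    """Estimate empirical P(action | state) from logged (state, action) pairs.

    Assumption: a plain frequency table stands in for a learned generative model.
    """
    counts = defaultdict(Counter)
    for state, action in transitions:
        counts[state][action] += 1
    model = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        model[state] = {a: n / total for a, n in action_counts.items()}
    return model


def constrained_greedy_action(q_values, behavior_probs, threshold=0.3):
    """Return the highest-Q action among those with enough support in the data.

    An action is allowed when its probability, relative to the most frequent
    logged action in this state, is at least `threshold` (hypothetical value).
    """
    if not behavior_probs:
        raise ValueError("state not covered by the logged data")
    max_p = max(behavior_probs.values())
    allowed = [a for a, p in behavior_probs.items() if p / max_p >= threshold]
    return max(allowed, key=lambda a: q_values[a])


# Toy log: in state "s0" students were mostly given "review", sometimes "quiz".
logged = [("s0", "review"), ("s0", "review"), ("s0", "quiz"), ("s0", "review")]
behavior = fit_behavior_model(logged)

# Hypothetical Q-values; "exam" never appears in the log, so its high value
# is an unverified extrapolation.
q = {"review": 1.0, "quiz": 1.5, "exam": 9.0}

unconstrained = max(q, key=q.get)
constrained = constrained_greedy_action(q, behavior["s0"])
print(unconstrained, constrained)
```

In this toy case the unconstrained greedy choice is the never-observed `exam`, while the constrained choice is restricted to activities that actually occur in the log, which is the kind of overestimated action the abstract says the method avoids.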
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.