{"title":"TADS:学习时间感知调度策略与动态式计划的间隔重复","authors":"Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu","doi":"10.1145/3397271.3401316","DOIUrl":null,"url":null,"abstract":"Spaced repetition technique aims at improving long-term memory retention for human students by exploiting repeated, spaced reviews of learning contents. The study of spaced repetition focuses on designing an optimal policy to schedule the learning contents. To the best of our knowledge, none of the existing methods based on reinforcement learning take into account the varying time intervals between two adjacent learning events of the student, which, however, are essential to determine real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits the varying time interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS) approach: a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy to select an optimal content according to the student's whole learning history and the time interval since the last learning event. Besides, Dyna-style planning is integrated into TADS to further improve the sample efficiency. We evaluate our approach on three environments built from synthetic data and real-world data based on well-recognized cognitive models. Empirical results demonstrate that TADS achieves superior performance against state-of-the-art algorithms.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition\",\"authors\":\"Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu\",\"doi\":\"10.1145/3397271.3401316\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Spaced repetition technique aims at improving long-term memory retention for human students by exploiting repeated, spaced reviews of learning contents. The study of spaced repetition focuses on designing an optimal policy to schedule the learning contents. To the best of our knowledge, none of the existing methods based on reinforcement learning take into account the varying time intervals between two adjacent learning events of the student, which, however, are essential to determine real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits the varying time interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS) approach: a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy to select an optimal content according to the student's whole learning history and the time interval since the last learning event. Besides, Dyna-style planning is integrated into TADS to further improve the sample efficiency. We evaluate our approach on three environments built from synthetic data and real-world data based on well-recognized cognitive models. 
Empirical results demonstrate that TADS achieves superior performance against state-of-the-art algorithms.\",\"PeriodicalId\":252050,\"journal\":{\"name\":\"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3397271.3401316\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3397271.3401316","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition
The spaced repetition technique aims to improve long-term memory retention in human students through repeated, spaced reviews of learning content. Research on spaced repetition focuses on designing an optimal policy for scheduling that content. To the best of our knowledge, none of the existing reinforcement-learning-based methods take into account the varying time intervals between two adjacent learning events of a student, which are nevertheless essential for determining a real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits this varying time-interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS): a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy that selects the optimal content according to the student's whole learning history and the time interval since the last learning event. In addition, Dyna-style planning is integrated into TADS to further improve sample efficiency. We evaluate our approach on three environments built from synthetic and real-world data on top of well-recognized cognitive models. Empirical results demonstrate that TADS achieves superior performance against state-of-the-art algorithms.
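The abstract names two mechanisms: a time-aware recurrent policy conditioned on the full learning history plus the elapsed interval since the last event, and Dyna-style planning that trains the policy on cheap model-generated rollouts. The sketch below is only a toy illustration of that combination, not the authors' implementation: it assumes an exponential-forgetting student model (a standard cognitive model in this literature), stands in for the paper's Time-LSTM with a plain LSTM that takes the interval as an extra input feature, and uses a crude REINFORCE update. All names here (StudentModel, TimeAwarePolicy, encode, dyna_episode) are hypothetical.

import math
import random

import torch
import torch.nn as nn


class StudentModel:
    """Toy memory model: each item has a half-life that grows on successful review.
    Stands in for the paper's 'well-recognized cognitive models' (assumption)."""
    def __init__(self, n_items):
        self.half_life = [1.0] * n_items          # in hours

    def recall_prob(self, item, dt):
        # Exponential forgetting: recall decays with time since last review.
        return math.exp(-dt / self.half_life[item])

    def review(self, item, dt):
        recalled = random.random() < self.recall_prob(item, dt)
        if recalled:
            self.half_life[item] *= 2.0           # spacing effect: memory strengthens
        return recalled


class TimeAwarePolicy(nn.Module):
    """Scores which item to review next from the learning history plus the
    interval since the last event. A plain LSTM with the interval as an input
    feature, standing in for the paper's Time-LSTM cell."""
    def __init__(self, n_items, hidden=32):
        super().__init__()
        # Feature per step: one-hot item, recall outcome, log(1 + dt).
        self.lstm = nn.LSTM(n_items + 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_items)

    def forward(self, feats):                     # feats: (1, steps, n_items + 2)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])              # logits over items


def encode(n_items, history):
    """history: list of (item, recalled, dt) tuples -> feature tensor."""
    rows = []
    for item, recalled, dt in history:
        row = [0.0] * (n_items + 2)
        row[item] = 1.0
        row[n_items] = float(recalled)
        row[n_items + 1] = math.log1p(dt)
        rows.append(row)
    return torch.tensor([rows])


def dyna_episode(policy, opt, n_items=5, steps=20, planning=3):
    """One REINFORCE episode plus extra model rollouts, in the Dyna spirit.
    Dyna proper interleaves real experience with rollouts from a model fitted
    to it; with no real students here, every rollout uses the assumed simulator,
    so this only illustrates the shape of the planning loop."""
    for _ in range(1 + planning):                 # 1 "real" + `planning` simulated
        student = StudentModel(n_items)
        history = [(0, True, 0.0)]
        log_probs, reward = [], 0.0
        for _ in range(steps):
            dt = random.uniform(0.5, 8.0)         # varying, uncontrolled interval
            logits = policy(encode(n_items, history))
            dist = torch.distributions.Categorical(logits=logits)
            item = dist.sample()
            log_probs.append(dist.log_prob(item))
            recalled = student.review(item.item(), dt)
            reward += float(recalled)
            history.append((item.item(), recalled, dt))
        # Crude whole-episode REINFORCE: total log-prob weighted by mean reward.
        loss = -torch.stack(log_probs).sum() * reward / steps
        opt.zero_grad()
        loss.backward()
        opt.step()


if __name__ == "__main__":
    policy = TimeAwarePolicy(n_items=5)
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(50):
        dyna_episode(policy, opt)
    print("trained a toy time-aware scheduling policy")

In the paper's setting, the "real" environment would be actual students, which is where Dyna-style planning earns its keep: rollouts from a fitted memory model are far cheaper than real study sessions, so the policy can improve with far fewer real interactions.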