TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval Pub Date : 2020-07-25 DOI:10.1145/3397271.3401316

Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu

{"title":"TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition","authors":"Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu","doi":"10.1145/3397271.3401316","DOIUrl":null,"url":null,"abstract":"Spaced repetition technique aims at improving long-term memory retention for human students by exploiting repeated, spaced reviews of learning contents. The study of spaced repetition focuses on designing an optimal policy to schedule the learning contents. To the best of our knowledge, none of the existing methods based on reinforcement learning take into account the varying time intervals between two adjacent learning events of the student, which, however, are essential to determine real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits the varying time interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS) approach: a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy to select an optimal content according to the student's whole learning history and the time interval since the last learning event. Besides, Dyna-style planning is integrated into TADS to further improve the sample efficiency. We evaluate our approach on three environments built from synthetic data and real-world data based on well-recognized cognitive models. Empirical results demonstrate that TADS achieves superior performance against state-of-the-art algorithms.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3397271.3401316","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

Abstract

Spaced repetition technique aims at improving long-term memory retention for human students by exploiting repeated, spaced reviews of learning contents. The study of spaced repetition focuses on designing an optimal policy to schedule the learning contents. To the best of our knowledge, none of the existing methods based on reinforcement learning take into account the varying time intervals between two adjacent learning events of the student, which, however, are essential to determine real-world schedule. In this paper, we aim to learn a scheduling policy that fully exploits the varying time interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS) approach: a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy to select an optimal content according to the student's whole learning history and the time interval since the last learning event. Besides, Dyna-style planning is integrated into TADS to further improve the sample efficiency. We evaluate our approach on three environments built from synthetic data and real-world data based on well-recognized cognitive models. Empirical results demonstrate that TADS achieves superior performance against state-of-the-art algorithms.

查看原文本刊更多论文

TADS:学习时间感知调度策略与动态式计划的间隔重复

间隔重复技术旨在通过对学习内容的重复、间隔复习来提高人类学生的长期记忆力。空间重复学习的研究重点在于设计学习内容的最优调度策略。据我们所知，现有的基于强化学习的方法都没有考虑到学生两个相邻学习事件之间不断变化的时间间隔，然而，这对于确定现实世界的时间表是必不可少的。在本文中，我们的目标是学习一种充分利用变化的时间间隔信息和高样本效率的调度策略。我们提出了具有动态规划(TADS)方法的时间感知调度程序:一种用于现实间隔重复的样本高效强化学习框架。TADS学习一种time - lstm策略，根据学生的整个学习历史和距离上次学习事件的时间间隔来选择最优内容。此外，在TADS中集成了dyna式规划，进一步提高了采样效率。我们在基于公认的认知模型的合成数据和现实世界数据构建的三种环境中评估了我们的方法。实证结果表明，TADS与最先进的算法相比具有优越的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

自引率

0.00%

发文量