TADS: Learning Time-Aware Scheduling Policy with Dyna-Style Planning for Spaced Repetition

Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu
Published in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Publication date: 2020-07-25
DOI: 10.1145/3397271.3401316 (https://doi.org/10.1145/3397271.3401316)
Citations: 4

Abstract

Spaced repetition aims to improve long-term memory retention in human students through repeated, spaced reviews of learning content. Research on spaced repetition focuses on designing an optimal policy for scheduling that content. To the best of our knowledge, no existing reinforcement-learning-based method takes into account the varying time intervals between two adjacent learning events of a student, even though these intervals are essential for determining real-world schedules. In this paper, we aim to learn a scheduling policy that fully exploits this varying time-interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS): a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy that selects the optimal content according to the student's whole learning history and the time interval since the last learning event. In addition, Dyna-style planning is integrated into TADS to further improve sample efficiency. We evaluate our approach on three environments built from synthetic and real-world data, based on well-recognized cognitive models. Empirical results demonstrate that TADS outperforms state-of-the-art algorithms.
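
To make concrete why the time interval between adjacent learning events matters, here is a minimal sketch of a simulated student. It is not the authors' environment: the exponential forgetting curve is assumed as one possible memory model, and the class name `ForgettingCurveStudent`, the function `recall_trace`, the `decay` parameter, and the rule that each review slows forgetting are all hypothetical choices made for illustration.

```python
import math


class ForgettingCurveStudent:
    """Toy simulated student (hypothetical, not the paper's environment).

    Assumption: recall probability of an item decays exponentially with
    the time elapsed since its last review, and each review of the item
    slows that decay.
    """

    def __init__(self, n_items, decay=0.5):
        self.decay = decay                  # assumed forgetting rate
        self.reviews = [0] * n_items        # number of reviews per item
        self.last_seen = [None] * n_items   # clock time of last review
        self.clock = 0.0

    def recall_prob(self, item):
        """Probability of recalling `item` at the current clock time."""
        if self.last_seen[item] is None:
            return 0.0  # never studied
        elapsed = self.clock - self.last_seen[item]
        return math.exp(-self.decay * elapsed / self.reviews[item])

    def review(self, item, interval):
        """Wait `interval` time units, then review `item`.

        Returns the recall probability just before the review, which a
        time-aware scheduler could observe together with `interval`.
        """
        self.clock += interval
        p = self.recall_prob(item)
        self.reviews[item] += 1
        self.last_seen[item] = self.clock
        return p


def recall_trace(schedule, n_items=1):
    """Run a list of (item, interval) pairs and collect recall probabilities."""
    student = ForgettingCurveStudent(n_items)
    return [student.review(item, interval) for item, interval in schedule]


# Same item sequence, different gaps between adjacent learning events.
short_gaps = recall_trace([(0, 1.0), (0, 1.0), (0, 1.0)])
long_gaps = recall_trace([(0, 1.0), (0, 10.0), (0, 10.0)])
print(short_gaps)
print(long_gaps)
```

Under these assumptions, the two schedules review the same item in the same order yet leave the student in different memory states. A policy that sees only the item sequence cannot tell the two cases apart, whereas one that also receives the intervals can.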