Reinforcement Learning: the Sooner the Better, or the Later the Better?

Shitian Shen, Min Chi
{"title":"Reinforcement Learning: the Sooner the Better, or the Later the Better?","authors":"Shitian Shen, Min Chi","doi":"10.1145/2930238.2930247","DOIUrl":null,"url":null,"abstract":"Reinforcement Learning (RL) is one of the best machine learning approaches for decision making in interactive environments. RL focuses on inducing effective decision making policies with the goal of maximizing the agent's cumulative reward. In this study, we investigated the impact of both immediate and delayed reward functions on RL-induced policies and empirically evaluated the effectiveness of induced policies within an Intelligent Tutoring System called Deep Thought. Moreover, we divided students into Fast and Slow learners based on their incoming competence as measured by their average response time on the initial tutorial level. Our results show that there was a significant interaction effect between the induced policies and the students' incoming competence. More specifically, Fast learners are less sensitive to learning environments in that they can learn equally well regardless of the pedagogical strategies employed by the tutor, but Slow learners benefit significantly more from effective pedagogical strategies than from ineffective ones. In fact, with effective pedagogical strategies the slow learners learned as much as their faster peers, but with ineffective pedagogical strategies the former learned significantly less than the latter.","PeriodicalId":339100,"journal":{"name":"Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2930238.2930247","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 28

Abstract

Reinforcement Learning (RL) is one of the best machine learning approaches for decision making in interactive environments. RL focuses on inducing effective decision making policies with the goal of maximizing the agent's cumulative reward. In this study, we investigated the impact of both immediate and delayed reward functions on RL-induced policies and empirically evaluated the effectiveness of induced policies within an Intelligent Tutoring System called Deep Thought. Moreover, we divided students into Fast and Slow learners based on their incoming competence as measured by their average response time on the initial tutorial level. Our results show that there was a significant interaction effect between the induced policies and the students' incoming competence. More specifically, Fast learners are less sensitive to learning environments in that they can learn equally well regardless of the pedagogical strategies employed by the tutor, but Slow learners benefit significantly more from effective pedagogical strategies than from ineffective ones. In fact, with effective pedagogical strategies the slow learners learned as much as their faster peers, but with ineffective pedagogical strategies the former learned significantly less than the latter.
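
The abstract's central contrast, rewarding the agent immediately after each decision versus only at the end of a session, can be made concrete with a small sketch. The following is a minimal, hypothetical tabular Q-learning example in Python, not the paper's actual policy-induction pipeline for Deep Thought: the toy state space (the step index), the two actions, and the reward numbers are all invented for illustration. The only thing it demonstrates is how the same learning rule consumes an immediate versus a delayed reward signal.

```python
import random

# Toy episodic decision process standing in for a tutoring session:
# at each of N_STEPS decision points the agent picks one of two
# pedagogical actions. Everything here (state = step index, the two
# actions, the reward numbers) is a hypothetical illustration, not
# the paper's actual Deep Thought setup.
N_STEPS = 5
ACTIONS = (0, 1)   # e.g., 0 = worked example, 1 = problem solving
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.9, 0.1, 5000


def step_quality(action):
    """Hidden per-step quality of a choice (hypothetical numbers)."""
    return 1.0 if action == 1 else 0.5


def run_episode(q, delayed):
    """One epsilon-greedy tabular Q-learning episode.

    delayed=False: the agent is rewarded after every decision (immediate).
    delayed=True:  all reward is withheld until the terminal step, like a
                   post-test learning gain observed only at session end.
    """
    total = 0.0
    for t in range(N_STEPS):
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q[(t, x)])
        total += step_quality(a)
        r = 0.0 if delayed else step_quality(a)
        if t == N_STEPS - 1:          # terminal step: no bootstrap term
            if delayed:
                r = total             # the entire reward arrives at once
            target = r
        else:
            target = r + GAMMA * max(q[(t + 1, b)] for b in ACTIONS)
        q[(t, a)] += ALPHA * (target - q[(t, a)])


for delayed in (False, True):
    random.seed(0)
    q = {(t, a): 0.0 for t in range(N_STEPS) for a in ACTIONS}
    for _ in range(EPISODES):
        run_episode(q, delayed)
    policy = [max(ACTIONS, key=lambda a: q[(t, a)]) for t in range(N_STEPS)]
    print("delayed " if delayed else "immediate", policy)
```

In the delayed variant the terminal update must propagate credit back through every earlier step via bootstrapping, so the learning signal reaching early decisions is noisier and slower; that trade-off between informative immediate feedback and faithful end-of-session outcomes is the "sooner versus later" question the paper evaluates empirically.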