A Reinforcement Learning and Recurrent Neural Network Based Dynamic User Modeling System

Abhishek Tripathi, T. S. Ashwin, R. R. Guddeti
{"title":"A Reinforcement Learning and Recurrent Neural Network Based Dynamic User Modeling System","authors":"Abhishek Tripathi, S. AshwinT., R. R. Guddeti","doi":"10.1109/ICALT.2018.00103","DOIUrl":null,"url":null,"abstract":"With the exponential growth in areas of machine intelligence, the world has witnessed promising solutions to the personalized content recommendation. The ability of interactive learning agents to take optimal decisions in dynamic environments has been very well conceptualized and proven by Reinforcement Learning (RL). The learning characteristics of Deep-Bidirectional Recurrent Neural Networks (DBRNN) in both positive and negative time directions has shown exceptional performance as generative models to generate sequential data in supervised learning tasks. In this paper, we harness the potential of the said two techniques and strive to create personalized video recommendation through emotional intelligence by presenting a novel context-aware collaborative filtering approach where intensity of users' spontaneous non-verbal emotional response towards recommended video is captured through system-interactions and facial expression analysis for decision-making and video corpus evolution with real-time data streams. We take into account a user's dynamic nature in the formulation of optimal policies, by framing up an RL-scenario with an off-policy (Q-Learning) algorithm for temporal-difference learning, which is used to train DBRNN to learn contextual patterns and generate new video sequences for the recommendation. Evaluation of our system with real users for a month shows that our approach outperforms state-of-the-art methods and models a user's emotional preferences very well with stable convergence.","PeriodicalId":361110,"journal":{"name":"2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICALT.2018.00103","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

With the exponential growth of machine intelligence, the world has witnessed promising solutions for personalized content recommendation. The ability of interactive learning agents to make optimal decisions in dynamic environments has been well conceptualized and demonstrated by Reinforcement Learning (RL). Deep Bidirectional Recurrent Neural Networks (DBRNN), which learn in both the positive and negative time directions, have shown exceptional performance as generative models for sequential data in supervised learning tasks. In this paper, we harness the potential of these two techniques to create personalized video recommendation through emotional intelligence, presenting a novel context-aware collaborative filtering approach in which the intensity of a user's spontaneous non-verbal emotional response to a recommended video is captured through system interactions and facial expression analysis, and used for decision-making and for evolving the video corpus with real-time data streams. We account for a user's dynamic nature when formulating optimal policies by framing an RL scenario with an off-policy (Q-Learning) algorithm for temporal-difference learning, which is used to train the DBRNN to learn contextual patterns and generate new video sequences for recommendation. A month-long evaluation of our system with real users shows that our approach outperforms state-of-the-art methods and models a user's emotional preferences very well, with stable convergence.
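For context, the temporal-difference update at the core of off-policy Q-Learning takes the standard textbook form (generic notation, not drawn from the paper itself): with state s_t, action a_t, reward r_{t+1}, learning rate α, and discount factor γ,

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$

A minimal, self-contained Python sketch of this update follows. It is a generic tabular illustration under assumed toy dimensions, not the authors' implementation; here the reward is a stand-in for the measured intensity of a user's emotional response to a recommended video, and actions index candidate videos.

import numpy as np

# Tabular Q-Learning sketch (hypothetical toy setup, not the paper's code).
# States index contextual situations; actions index candidate videos.
n_states, n_actions = 10, 5
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(state, action):
    # Stand-in environment: the reward plays the role of the measured
    # intensity of the user's emotional response to the recommended video.
    reward = rng.uniform(-1.0, 1.0)
    next_state = int(rng.integers(n_states))
    return reward, next_state

state = 0
for _ in range(1000):
    # Epsilon-greedy behaviour policy; the update below targets the greedy
    # policy, which is what makes Q-Learning off-policy.
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(Q[state]))
    reward, next_state = step(state, action)
    # Temporal-difference update toward r + gamma * max_a' Q(s', a').
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

In the paper's setting, these Q-values would inform the training of the DBRNN that generates video sequences for recommendation; that coupling is specific to the authors' system and is not reproduced here.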