A simple multi-armed nearest-neighbor bandit for interactive recommendation

Proceedings of the 13th ACM Conference on Recommender Systems. Pub Date: 2019-09-10. DOI: 10.1145/3298689.3347040

Javier Sanz-Cruzado, P. Castells, Esther López

Citations: 28

Abstract

The cyclic nature of the recommendation task is being increasingly taken into account in recommender systems research. In this line, framing interactive recommendation as a genuine reinforcement learning problem, multi-armed bandit approaches have been increasingly considered as a means to cope with the dual exploitation/exploration goal of recommendation. In this paper we develop a simple multi-armed bandit elaboration of neighbor-based collaborative filtering. The approach can be seen as a variant of the nearest-neighbors scheme, but endowed with a controlled stochastic exploration capability of the users' neighborhood, by a parameter-free application of Thompson sampling. Our approach is based on a formal development and a reasonably simple design, whereby it aims to be easy to reproduce and further elaborate upon. We report experiments using datasets from different domains showing that neighbor-based bandits indeed achieve recommendation accuracy enhancements in the mid to long run.
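To make the idea in the abstract concrete, the following is a minimal sketch of how a nearest-neighbor scheme might be combined with Thompson sampling. It is an illustration under stated assumptions, not the authors' implementation: the class name `NeighborBandit`, the binary reward, the uniform Beta(1, 1) prior, and the rule of recommending an item liked by the single sampled neighbor are all hypothetical choices made here. Each other user is treated as an arm for the target user, with a Beta posterior over how often that neighbor's liked items turn out to be liked by the target user.

```python
import random
from collections import defaultdict


class NeighborBandit:
    """Hypothetical sketch: every other user is an arm for the target user."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.likes = defaultdict(set)     # user -> items with positive feedback
        self.seen = defaultdict(set)      # user -> items already shown to the user
        self.hits = defaultdict(float)    # (user, neighbor) -> agreements observed
        self.misses = defaultdict(float)  # (user, neighbor) -> disagreements observed

    def recommend(self, user, catalog):
        """Pick an unseen item via one Thompson draw per candidate neighbor."""
        candidates = [i for i in catalog if i not in self.seen[user]]
        if not candidates:
            return None
        best_item, best_draw = None, -1.0
        for neighbor, liked in self.likes.items():
            if neighbor == user:
                continue
            options = [i for i in candidates if i in liked]
            if not options:
                continue
            # Assumed uniform prior Beta(1, 1), updated with observed counts.
            draw = self.rng.betavariate(
                1.0 + self.hits.get((user, neighbor), 0.0),
                1.0 + self.misses.get((user, neighbor), 0.0),
            )
            if draw > best_draw:
                best_draw = draw
                best_item = self.rng.choice(options)
        # With no usable neighbor, fall back to a uniformly random unseen item.
        return best_item if best_item is not None else self.rng.choice(candidates)

    def update(self, user, item, reward):
        """Record binary feedback (1 = liked, 0 = not liked) for a shown item."""
        self.seen[user].add(item)
        for neighbor, liked in self.likes.items():
            if neighbor != user and item in liked:
                self.hits[(user, neighbor)] += reward
                self.misses[(user, neighbor)] += 1 - reward
        if reward:
            self.likes[user].add(item)
```

In an interactive loop one would call `recommend`, show the item, and pass the observed feedback to `update`. Because the neighbor is chosen from posterior draws rather than from a fixed similarity ranking, rarely tried neighbors still get selected occasionally, which is the exploration behavior the abstract describes.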


