Efficient and Optimal Fixed-Time Regret with Two Experts
L. Greenstreet, Nicholas J. A. Harvey, Victor S. Portella
International Conference on Algorithmic Learning Theory, 2022. DOI: 10.48550/arXiv.2203.07577
Citations: 3
Abstract
Prediction with expert advice is a foundational problem in online learning. In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand. Moreover, this is asymptotically optimal when both $T$ and $n$ grow to infinity. However, when the number of experts $n$ is small or fixed, algorithms with better regret guarantees exist. In 1967, Cover gave a dynamic programming algorithm for the two-experts problem restricted to $\{0,1\}$ costs that suffers at most $\sqrt{T/(2\pi)} + O(1)$ regret with $O(T^2)$ pre-processing time. In this work, we propose an optimal algorithm for prediction with two experts' advice that works even for costs in $[0,1]$ and uses $O(1)$ processing time per turn. Our algorithm builds on recent work on the experts problem based on techniques and tools from stochastic calculus.
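For context, the $\sqrt{(T/2)\ln n}$ baseline quoted above is the guarantee of the classical Multiplicative Weights Update (exponentially weighted average) method. The sketch below is a minimal illustration of that baseline only, not of the paper's two-expert algorithm; the learning rate $\eta = \sqrt{8\ln(n)/T}$ is the standard fixed-horizon tuning that yields the stated bound, and the random cost matrix is purely for demonstration.

```python
import numpy as np

def multiplicative_weights(costs, eta):
    """Classical Multiplicative Weights Update for prediction with expert advice.

    costs: array of shape (T, n) with per-round expert costs in [0, 1].
    eta:   learning rate; eta = sqrt(8 ln(n) / T) gives regret <= sqrt((T/2) ln n).
    Returns the learner's total (expected) cost and its regret against the best expert.
    """
    T, n = costs.shape
    weights = np.ones(n)
    learner_cost = 0.0
    for t in range(T):
        probs = weights / weights.sum()      # play the normalized weights
        learner_cost += probs @ costs[t]     # expected cost incurred this round
        weights *= np.exp(-eta * costs[t])   # exponentially down-weight costly experts
    best_expert_cost = costs.sum(axis=0).min()
    return learner_cost, learner_cost - best_expert_cost

# Toy run: T rounds, n = 2 experts with random costs in [0, 1).
T, n = 10_000, 2
rng = np.random.default_rng(0)
costs = rng.random((T, n))
eta = np.sqrt(8 * np.log(n) / T)
total, regret = multiplicative_weights(costs, eta)
print(f"regret = {regret:.2f}, bound = {np.sqrt((T / 2) * np.log(n)):.2f}")
```

For $n = 2$ this bound is $\sqrt{(T/2)\ln 2} \approx 0.59\sqrt{T}$, whereas the paper's fixed-time algorithm achieves the optimal $\sqrt{T/(2\pi)} + O(1) \approx 0.40\sqrt{T}$ regret for general $[0,1]$ costs with constant processing time per turn.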