Structured Projection-free Online Convex Optimization with Multi-point Bandit Feedback

2021 60th IEEE Conference on Decision and Control (CDC). Pub Date: 2021-12-14. DOI: 10.1109/CDC45484.2021.9683142

Yuhao Ding, J. Lavaei



Abstract

We consider structured online convex optimization (OCO) with bandit feedback, where either the loss function is smooth or the constraint set is strongly convex. Projection-free methods are among the most popular and computationally efficient algorithms for this problem, mainly because they can handle convex constraints arising in machine learning for which computing projections is often impractical in high-dimensional settings. Although improved regret bounds are known for the full-information setting, where the gradients of the functions are readily available, it remains unclear whether simple projection-free zero-order algorithms become more efficient for structured OCO problems when multiple function values can be sampled at each time instance. In this paper, we develop simple projection-free algorithms and prove that they achieve the same improved regret bounds as in the full-information case under various additional problem structures. This implies that leveraging the structural properties of the problem compensates for the lack of access to the gradients. Experiments on online matrix completion show several advantages of the proposed algorithms, including their simplicity, ease of implementation, and effectiveness, as they outperform competing algorithms.
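To make the setting concrete, the following is a minimal sketch of a projection-free update driven by multi-point function-value feedback. It is not the authors' algorithm: the abstract does not specify their estimator, step sizes, or constraint set, so everything here is an assumption chosen for illustration. The sketch uses a forward-difference gradient estimate built from d + 1 function values per round, an L1 ball as the constraint set, a plain Frank-Wolfe step with step size 2/(t + 2), and the hypothetical names `coordinate_gradient_estimate`, `lmo_l1_ball`, and `bandit_frank_wolfe`. It also assumes each loss can be evaluated at points slightly outside the feasible set.

```python
import math


def coordinate_gradient_estimate(f, x, delta):
    """Estimate the gradient of f at x from d + 1 function values.

    Assumption: forward differences along each coordinate; f is only
    queried for values, never for gradients.
    """
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += delta
        grad.append((f(shifted) - fx) / delta)
    return grad


def lmo_l1_ball(g, radius):
    """Linear minimization oracle: a minimizer of <g, v> over the L1 ball.

    The minimizer is a signed, scaled basis vector, so no projection
    onto the constraint set is ever computed.
    """
    i = max(range(len(g)), key=lambda k: abs(g[k]))
    v = [0.0] * len(g)
    if g[i] != 0.0:
        v[i] = -radius * math.copysign(1.0, g[i])
    return v


def bandit_frank_wolfe(losses, dim, radius=1.0, delta=1e-4):
    """Play one point per round, then take a Frank-Wolfe step.

    losses: sequence of callables, one loss function per round.
    Returns the list of played points and the final iterate.
    """
    x = [0.0] * dim  # the origin is feasible for any L1 ball
    played = []
    for t, f in enumerate(losses, start=1):
        played.append(list(x))
        g = coordinate_gradient_estimate(f, x, delta)
        v = lmo_l1_ball(g, radius)
        eta = 2.0 / (t + 2)  # assumed step-size schedule
        # Convex combination of feasible points stays feasible.
        x = [(1 - eta) * xi + eta * vi for xi, vi in zip(x, v)]
    return played, x


if __name__ == "__main__":
    target = [0.3, -0.2]

    def loss(x):
        return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

    _, x_final = bandit_frank_wolfe([loss] * 2000, dim=2)
    print(x_final, loss(x_final))
```

The design point the sketch illustrates is that each round needs only function values and one linear minimization over the constraint set. For sets such as the L1 ball, or the trace-norm ball common in matrix completion, linear minimization is typically much cheaper than a projection.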
