Online Nonconvex Bilevel Optimization with Bregman Divergences
Jason Bohne, David Rosenberg, Gary Kazantsev, Pawel Polak
arXiv - MATH - Optimization and Control, published 2024-09-16. https://doi.org/arxiv-2409.10470
Citations: 0
Abstract
Bilevel optimization methods are increasingly relevant within machine learning, especially for tasks such as hyperparameter optimization and meta-learning. Compared to the offline setting, online bilevel optimization (OBO) offers a more dynamic framework by accommodating time-varying functions and sequentially arriving data. This study addresses the online nonconvex-strongly convex bilevel optimization problem.
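For context, a minimal sketch of the standard online nonconvex-strongly convex bilevel template that this setting typically refers to, with the hypergradient obtained via the implicit function theorem. The notation ($f_t$, $g_t$, $F_t$, $x$, $y$) is ours and is not taken from the paper.

% Standard online bilevel template (our notation, not the paper's):
% f_t is possibly nonconvex in the outer variable x, g_t is strongly
% convex in the inner variable y, and both may change with the round t.
\begin{aligned}
  &\min_{x \in \mathcal{X}} \; F_t(x) := f_t\bigl(x, y_t^*(x)\bigr)
  \quad \text{s.t.} \quad y_t^*(x) \in \arg\min_{y} \; g_t(x, y), \\[4pt]
  % Hypergradient via the implicit function theorem; well defined because
  % strong convexity of g_t in y makes y_t^*(x) unique and differentiable:
  &\nabla F_t(x) = \nabla_x f_t\bigl(x, y_t^*(x)\bigr)
   - \nabla^2_{xy} g_t\bigl(x, y_t^*(x)\bigr)
     \bigl[\nabla^2_{yy} g_t\bigl(x, y_t^*(x)\bigr)\bigr]^{-1}
     \nabla_y f_t\bigl(x, y_t^*(x)\bigr).
\end{aligned}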
In deterministic settings, we introduce a novel online Bregman bilevel optimizer (OBBO) that utilizes adaptive Bregman divergences. We demonstrate that OBBO improves upon the known sublinear rates for bilevel local regret through a novel hypergradient error decomposition that adapts to the underlying geometry of the problem.
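As a rough illustration of what a Bregman-divergence outer step looks like, here is a minimal mirror-descent-style sketch in Python. It is not the authors' OBBO algorithm: the mirror map gradients psi_grad/psi_grad_inv, the hypergradient input, and the step size are hypothetical placeholders of our own.

import numpy as np

# Illustrative mirror-descent-style outer update with a Bregman divergence.
# NOT the OBBO algorithm from the paper; all names below are placeholders.

def bregman_outer_step(x, hypergrad, psi_grad, psi_grad_inv, step_size):
    """One outer step x_{t+1} = argmin_u <g_t, u> + (1/eta) D_psi(u, x_t).

    For a differentiable mirror map psi, the unconstrained minimizer satisfies
    grad_psi(x_{t+1}) = grad_psi(x_t) - eta * g_t.
    """
    dual = psi_grad(x) - step_size * hypergrad
    return psi_grad_inv(dual)

# Example: psi(x) = 0.5 * ||x||^2 recovers plain online gradient descent.
psi_grad = lambda x: x
psi_grad_inv = lambda z: z

x = np.zeros(3)
g = np.array([0.5, -1.0, 0.2])   # stand-in for a hypergradient estimate
x_next = bregman_outer_step(x, g, psi_grad, psi_grad_inv, step_size=0.1)
print(x_next)                    # -> [-0.05  0.1  -0.02]

Choosing psi to match the problem geometry (for example, an entropy mirror map on the simplex) is what makes the divergence "adaptive" relative to plain Euclidean gradient steps.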
In stochastic contexts, we introduce the first stochastic online bilevel optimizer (SOBBO), which employs a window averaging method for updating outer-level variables using a weighted average of recent stochastic approximations of hypergradients. This approach not only achieves sublinear rates of bilevel local regret but also serves as an effective variance reduction strategy, obviating the need for additional stochastic gradient samples at each timestep.
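To make the window-averaging idea concrete, below is a small Python sketch that maintains an average of the most recent stochastic hypergradient estimates and feeds it to the outer update. The window length and uniform weights are illustrative choices of ours, not the SOBBO weighting scheme from the paper.

from collections import deque
import numpy as np

# Illustrative window average of recent stochastic hypergradient estimates.
# Sketch only: window size and uniform weights are our own placeholders.

class HypergradientWindow:
    def __init__(self, window_size):
        self.buffer = deque(maxlen=window_size)

    def update(self, stochastic_hypergrad):
        """Store the newest estimate and return the window average."""
        self.buffer.append(np.asarray(stochastic_hypergrad, dtype=float))
        return np.mean(self.buffer, axis=0)

# Usage: averaging past estimates reduces the variance of the per-round
# hypergradient without drawing extra stochastic samples at the current step.
window = HypergradientWindow(window_size=4)
rng = np.random.default_rng(0)
x_outer = np.zeros(2)
for t in range(6):
    noisy_grad = np.array([1.0, -2.0]) + 0.5 * rng.standard_normal(2)
    averaged = window.update(noisy_grad)
    x_outer -= 0.1 * averaged   # outer step using the averaged estimate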
Experiments on online hyperparameter optimization and online meta-learning highlight the superior performance, efficiency, and adaptability of our Bregman-based algorithms compared to established online and offline bilevel benchmarks.