On Acceleration of Gradient-Based Empirical Risk Minimization using Local Polynomial Regression.

Control Conference (ECC) ... European. European Control Conference. Pub Date: 2022-07-01. Epub Date: 2022-08-05. DOI: 10.23919/ecc55457.2022.9838261

Ekaterina Trimbach, Edward Duc Hien Nguyen, César A Uribe

Volume 2022, pages 429-434. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581727/pdf/nihms-1842409.pdf

Abstract

We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD) recently proposed for the approximate solution of empirical risk minimization (ERM) problems. We focus on loss functions that are strongly convex and smooth with condition number $\sigma$. We additionally assume the loss function is $\eta$-Hölder continuous with respect to the data. The oracle complexity of LPI-GD is $\tilde{O}(\sigma m^{d} \log(1/\varepsilon))$ for a desired accuracy $\varepsilon$, where $d$ is the dimension of the parameter space and $m$ is the cardinality of an approximation grid. The factor $m^{d}$ can be shown to scale as $O((1/\varepsilon)^{d/(2\eta)})$. LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) for certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show an oracle complexity of $\tilde{O}(\sqrt{\sigma}\, m^{d} \log(1/\varepsilon))$. Moreover, we provide the first empirical study on local polynomial interpolation-based gradient methods and corroborate that LPI-GD has better performance than GD and SGD in some scenarios, and that the proposed methods achieve acceleration.
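
As a worked reading of these bounds, the grid scaling $m^{d} = O((1/\varepsilon)^{d/(2\eta)})$ can be substituted into both complexities. This is only a sketch: it takes the rates stated in the abstract at face value and ignores the constants and logarithmic factors hidden by $\tilde{O}$.

$$
\underbrace{\tilde{O}\!\left(\sigma \,(1/\varepsilon)^{d/(2\eta)} \log(1/\varepsilon)\right)}_{\text{LPI-GD}}
\qquad \text{versus} \qquad
\underbrace{\tilde{O}\!\left(\sqrt{\sigma}\,(1/\varepsilon)^{d/(2\eta)} \log(1/\varepsilon)\right)}_{\text{accelerated methods}}
$$

The two bounds differ only in their dependence on the condition number, so their leading terms are in the ratio $\sigma / \sqrt{\sigma} = \sqrt{\sigma}$. For an illustrative value $\sigma = 100$ (an assumption chosen for the example, not a figure from the paper), the accelerated bound is smaller by a factor of about $10$. This compares upper bounds on oracle calls, not measured running times.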
