On Acceleration of Gradient-Based Empirical Risk Minimization using Local Polynomial Regression.

Ekaterina Trimbach, Edward Duc Hien Nguyen, César A Uribe
Published in: European Control Conference (ECC), 2022-07-01 (Epub 2022-08-05)
DOI: https://doi.org/10.23919/ecc55457.2022.9838261
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581727/pdf/nihms-1842409.pdf

Abstract

We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD), recently proposed for the approximate solution of empirical risk minimization (ERM) problems. We focus on loss functions that are strongly convex and smooth with condition number $\sigma$. We additionally assume the loss function is $\eta$-Hölder continuous with respect to the data. The oracle complexity of LPI-GD is $\tilde{O}(\sigma m^d \log(1/\varepsilon))$ for a desired accuracy $\varepsilon$, where $d$ is the dimension of the parameter space and $m$ is the cardinality of an approximation grid. The factor $m^d$ can be shown to scale as $O((1/\varepsilon)^{d/2\eta})$. LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) for certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show an oracle complexity of $\tilde{O}(\sqrt{\sigma}\, m^d \log(1/\varepsilon))$. Moreover, we provide the first empirical study of local polynomial interpolation-based gradient methods and corroborate that LPI-GD performs better than GD and SGD in some scenarios, and that the proposed methods achieve acceleration.
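To make the scaling in the two complexity bounds concrete, the following is a minimal sketch that evaluates them numerically. It is illustrative only and rests on several assumptions not stated in the abstract: all constants are set to 1, the polylogarithmic factors hidden by the tilde notation are dropped, and the exponent $d/2\eta$ is read as $d/(2\eta)$. The function names are hypothetical and do not come from the paper.

```python
import math


def grid_factor(eps: float, d: int, eta: float) -> float:
    """Illustrative scaling of m^d, assumed to be (1/eps)^(d / (2*eta)) with constant 1."""
    return (1.0 / eps) ** (d / (2.0 * eta))


def lpi_gd_bound(sigma: float, eps: float, d: int, eta: float) -> float:
    """sigma * m^d * log(1/eps): the LPI-GD bound, hidden factors dropped (assumption)."""
    return sigma * grid_factor(eps, d, eta) * math.log(1.0 / eps)


def accelerated_bound(sigma: float, eps: float, d: int, eta: float) -> float:
    """sqrt(sigma) * m^d * log(1/eps): the accelerated bound, hidden factors dropped (assumption)."""
    return math.sqrt(sigma) * grid_factor(eps, d, eta) * math.log(1.0 / eps)


def speedup(sigma: float, eps: float, d: int, eta: float) -> float:
    """Ratio of the two bounds; under these simplifications it reduces to sqrt(sigma)."""
    return lpi_gd_bound(sigma, eps, d, eta) / accelerated_bound(sigma, eps, d, eta)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    sigma, eps, d, eta = 100.0, 1e-3, 2, 1.0
    print("LPI-GD bound:     ", lpi_gd_bound(sigma, eps, d, eta))
    print("Accelerated bound:", accelerated_bound(sigma, eps, d, eta))
    print("Speedup factor:   ", speedup(sigma, eps, d, eta))
```

Under these simplifications the grid factor and the logarithm cancel in the ratio, so the improvement depends only on the condition number: the accelerated bound is smaller by a factor of $\sqrt{\sigma}$.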

