First Order Stochastic Optimization with Oblivious Noise

Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
{"title":"一阶随机优化与模糊噪声","authors":"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos","doi":"arxiv-2408.02090","DOIUrl":null,"url":null,"abstract":"We initiate the study of stochastic optimization with oblivious noise,\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\naddition to random observation noise, the stochastic gradient may be subject to\nindependent oblivious noise, which may not have bounded moments and is not\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\nstochastic gradient of $f$ at $x$, which returns a vector $\\nabla f(\\gamma, x)\n+ \\xi$, where $\\gamma$ is the bounded variance observation noise and $\\xi$ is\nthe oblivious noise that is independent of $\\gamma$ and $x$. The only\nassumption we make on the oblivious noise $\\xi$ is that $\\mathbf{Pr}[\\xi = 0]\n\\ge \\alpha$ for some $\\alpha \\in (0, 1)$. In this setting, it is not\ninformation-theoretically possible to recover a single solution close to the\ntarget when the fraction of inliers $\\alpha$ is less than $1/2$. Our main\nresult is an efficient list-decodable learner that recovers a small list of\ncandidates, at least one of which is close to the true solution. On the other\nhand, if $\\alpha = 1-\\epsilon$, where $0< \\epsilon < 1/2$ is sufficiently small\nconstant, the algorithm recovers a single solution. Along the way, we develop a\nrejection-sampling-based algorithm to perform noisy location estimation, which\nmay be of independent interest.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"27 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"First Order Stochastic Optimization with Oblivious Noise\",\"authors\":\"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos\",\"doi\":\"arxiv-2408.02090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We initiate the study of stochastic optimization with oblivious noise,\\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\\naddition to random observation noise, the stochastic gradient may be subject to\\nindependent oblivious noise, which may not have bounded moments and is not\\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\\nstochastic gradient of $f$ at $x$, which returns a vector $\\\\nabla f(\\\\gamma, x)\\n+ \\\\xi$, where $\\\\gamma$ is the bounded variance observation noise and $\\\\xi$ is\\nthe oblivious noise that is independent of $\\\\gamma$ and $x$. The only\\nassumption we make on the oblivious noise $\\\\xi$ is that $\\\\mathbf{Pr}[\\\\xi = 0]\\n\\\\ge \\\\alpha$ for some $\\\\alpha \\\\in (0, 1)$. In this setting, it is not\\ninformation-theoretically possible to recover a single solution close to the\\ntarget when the fraction of inliers $\\\\alpha$ is less than $1/2$. Our main\\nresult is an efficient list-decodable learner that recovers a small list of\\ncandidates, at least one of which is close to the true solution. On the other\\nhand, if $\\\\alpha = 1-\\\\epsilon$, where $0< \\\\epsilon < 1/2$ is sufficiently small\\nconstant, the algorithm recovers a single solution. 
Along the way, we develop a\\nrejection-sampling-based algorithm to perform noisy location estimation, which\\nmay be of independent interest.\",\"PeriodicalId\":501525,\"journal\":{\"name\":\"arXiv - CS - Data Structures and Algorithms\",\"volume\":\"27 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Data Structures and Algorithms\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.02090\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Data Structures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.02090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We initiate the study of stochastic optimization with oblivious noise, broadly generalizing the standard heavy-tailed noise setup. In our setting, in addition to random observation noise, the stochastic gradient may be subject to independent oblivious noise, which may not have bounded moments and is not necessarily centered. Specifically, we assume access to a noisy oracle for the stochastic gradient of $f$ at $x$, which returns a vector $\nabla f(\gamma, x) + \xi$, where $\gamma$ is the bounded-variance observation noise and $\xi$ is the oblivious noise that is independent of $\gamma$ and $x$. The only assumption we make on the oblivious noise $\xi$ is that $\mathbf{Pr}[\xi = 0] \ge \alpha$ for some $\alpha \in (0, 1)$. In this setting, it is not information-theoretically possible to recover a single solution close to the target when the fraction of inliers $\alpha$ is less than $1/2$. Our main result is an efficient list-decodable learner that recovers a small list of candidates, at least one of which is close to the true solution. On the other hand, if $\alpha = 1 - \epsilon$, where $0 < \epsilon < 1/2$ is a sufficiently small constant, the algorithm recovers a single solution. Along the way, we develop a rejection-sampling-based algorithm to perform noisy location estimation, which may be of independent interest.
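To make the oracle model concrete, here is a minimal simulation sketch; it is not the paper's construction, and every name and distribution in it is an illustrative assumption. We take $f(x) = \frac{1}{2}\|x\|^2$ (so $\nabla f(x) = x$), Gaussian observation noise for $\gamma$, and oblivious noise $\xi$ that equals $0$ with probability $\alpha$ and is otherwise drawn from a shifted Cauchy distribution, which has no finite moments and is not centered.

```python
import numpy as np

# Hypothetical simulation of the noisy gradient oracle from the abstract.
# Assumptions (not from the paper): f(x) = 0.5 * ||x||^2, so grad f(x) = x;
# gamma is Gaussian (bounded variance); xi is 0 with probability alpha and
# otherwise a shifted Cauchy draw (heavy-tailed, uncentered), independent
# of gamma and x.

rng = np.random.default_rng(0)

def gradient_oracle(x, alpha=0.6, sigma=0.1):
    """Return nabla f(gamma, x) + xi for f(x) = 0.5 * ||x||^2."""
    gamma = rng.normal(0.0, sigma, size=x.shape)  # bounded-variance observation noise
    grad = x + gamma                              # stochastic gradient of f at x
    if rng.random() < alpha:
        xi = np.zeros_like(x)                     # inlier: Pr[xi = 0] >= alpha
    else:
        # heavy-tailed, uncentered corruption, independent of gamma and x
        xi = 5.0 + rng.standard_cauchy(size=x.shape)
    return grad + xi

# With alpha > 1/2, a coordinate-wise median over repeated oracle calls is
# a crude way to see why inliers can dominate: most samples concentrate
# around the true gradient, so the median lands inside the inlier cluster.
x = np.array([1.0, -2.0])
samples = np.stack([gradient_oracle(x) for _ in range(1001)])
print("true gradient:   ", x)
print("median estimate: ", np.median(samples, axis=0))
```

This toy median estimator only illustrates the regime $\alpha > 1/2$; when $\alpha < 1/2$ the inliers are a minority, which is exactly where the paper's list-decodable learner is needed.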