First Order Stochastic Optimization with Oblivious Noise
{"title":"一阶随机优化与模糊噪声","authors":"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos","doi":"arxiv-2408.02090","DOIUrl":null,"url":null,"abstract":"We initiate the study of stochastic optimization with oblivious noise,\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\naddition to random observation noise, the stochastic gradient may be subject to\nindependent oblivious noise, which may not have bounded moments and is not\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\nstochastic gradient of $f$ at $x$, which returns a vector $\\nabla f(\\gamma, x)\n+ \\xi$, where $\\gamma$ is the bounded variance observation noise and $\\xi$ is\nthe oblivious noise that is independent of $\\gamma$ and $x$. The only\nassumption we make on the oblivious noise $\\xi$ is that $\\mathbf{Pr}[\\xi = 0]\n\\ge \\alpha$ for some $\\alpha \\in (0, 1)$. In this setting, it is not\ninformation-theoretically possible to recover a single solution close to the\ntarget when the fraction of inliers $\\alpha$ is less than $1/2$. Our main\nresult is an efficient list-decodable learner that recovers a small list of\ncandidates, at least one of which is close to the true solution. On the other\nhand, if $\\alpha = 1-\\epsilon$, where $0< \\epsilon < 1/2$ is sufficiently small\nconstant, the algorithm recovers a single solution. Along the way, we develop a\nrejection-sampling-based algorithm to perform noisy location estimation, which\nmay be of independent interest.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"27 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"First Order Stochastic Optimization with Oblivious Noise\",\"authors\":\"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos\",\"doi\":\"arxiv-2408.02090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We initiate the study of stochastic optimization with oblivious noise,\\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\\naddition to random observation noise, the stochastic gradient may be subject to\\nindependent oblivious noise, which may not have bounded moments and is not\\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\\nstochastic gradient of $f$ at $x$, which returns a vector $\\\\nabla f(\\\\gamma, x)\\n+ \\\\xi$, where $\\\\gamma$ is the bounded variance observation noise and $\\\\xi$ is\\nthe oblivious noise that is independent of $\\\\gamma$ and $x$. The only\\nassumption we make on the oblivious noise $\\\\xi$ is that $\\\\mathbf{Pr}[\\\\xi = 0]\\n\\\\ge \\\\alpha$ for some $\\\\alpha \\\\in (0, 1)$. In this setting, it is not\\ninformation-theoretically possible to recover a single solution close to the\\ntarget when the fraction of inliers $\\\\alpha$ is less than $1/2$. Our main\\nresult is an efficient list-decodable learner that recovers a small list of\\ncandidates, at least one of which is close to the true solution. On the other\\nhand, if $\\\\alpha = 1-\\\\epsilon$, where $0< \\\\epsilon < 1/2$ is sufficiently small\\nconstant, the algorithm recovers a single solution. 
Along the way, we develop a\\nrejection-sampling-based algorithm to perform noisy location estimation, which\\nmay be of independent interest.\",\"PeriodicalId\":501525,\"journal\":{\"name\":\"arXiv - CS - Data Structures and Algorithms\",\"volume\":\"27 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Data Structures and Algorithms\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.02090\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Data Structures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.02090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
We initiate the study of stochastic optimization with oblivious noise,
broadly generalizing the standard heavy-tailed noise setup. In our setting, in
addition to random observation noise, the stochastic gradient may be subject to
independent oblivious noise, which may not have bounded moments and is not
necessarily centered. Specifically, we assume access to a noisy oracle for the
stochastic gradient of $f$ at $x$, which returns a vector $\nabla f(\gamma, x)
+ \xi$, where $\gamma$ is the bounded-variance observation noise and $\xi$ is
the oblivious noise that is independent of $\gamma$ and $x$. The only
assumption we make on the oblivious noise $\xi$ is that $\mathbf{Pr}[\xi = 0]
\ge \alpha$ for some $\alpha \in (0, 1)$. In this setting, it is not
information-theoretically possible to recover a single solution close to the
target when the fraction of inliers $\alpha$ is less than $1/2$. Our main
result is an efficient list-decodable learner that recovers a small list of
candidates, at least one of which is close to the true solution. On the other
hand, if $\alpha = 1-\epsilon$, where $0 < \epsilon < 1/2$ is a sufficiently small
constant, the algorithm recovers a single solution. Along the way, we develop a
rejection-sampling-based algorithm to perform noisy location estimation, which
may be of independent interest.
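
To make the oracle model concrete, below is a minimal Python sketch of the setting described above. The quadratic objective, the Gaussian choice for the observation noise $\gamma$, the shifted Cauchy choice for the oblivious noise $\xi$, and the value of $\alpha$ are hypothetical choices for illustration only; this is not the paper's list-decodable learner.

import numpy as np

rng = np.random.default_rng(0)

def noisy_gradient_oracle(x, alpha):
    # Toy objective f(x) = 0.5 * ||x||^2, whose true gradient is x. The
    # observation noise gamma is modeled additively here as mean-zero Gaussian
    # noise, one simple instance of bounded variance.
    gamma = rng.normal(scale=0.1, size=x.shape)
    # Oblivious noise xi: zero with probability alpha (so Pr[xi = 0] >= alpha),
    # otherwise heavy-tailed (Cauchy) and shifted, hence without bounded
    # moments and not centered. It is drawn independently of gamma and x.
    if rng.random() < alpha:
        xi = np.zeros_like(x)
    else:
        xi = rng.standard_cauchy(size=x.shape) + 5.0
    return x + gamma + xi

# Plain SGD on this oracle. With alpha = 0.4 < 1/2, a single run need not land
# near the optimum x* = 0, which is why only a short list of candidates, at
# least one of which is close to x*, can be guaranteed.
x = rng.normal(size=3)
for t in range(1, 2001):
    x -= (0.5 / t) * noisy_gradient_oracle(x, alpha=0.4)
print(x)

Independent restarts of such a loop give only a heuristic list of candidates; the paper's contribution is an efficient list-decodable learner with a provable guarantee that at least one returned candidate is close to the true solution.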