First Order Stochastic Optimization with Oblivious Noise
{"title":"一阶随机优化与模糊噪声","authors":"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos","doi":"arxiv-2408.02090","DOIUrl":null,"url":null,"abstract":"We initiate the study of stochastic optimization with oblivious noise,\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\naddition to random observation noise, the stochastic gradient may be subject to\nindependent oblivious noise, which may not have bounded moments and is not\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\nstochastic gradient of $f$ at $x$, which returns a vector $\\nabla f(\\gamma, x)\n+ \\xi$, where $\\gamma$ is the bounded variance observation noise and $\\xi$ is\nthe oblivious noise that is independent of $\\gamma$ and $x$. The only\nassumption we make on the oblivious noise $\\xi$ is that $\\mathbf{Pr}[\\xi = 0]\n\\ge \\alpha$ for some $\\alpha \\in (0, 1)$. In this setting, it is not\ninformation-theoretically possible to recover a single solution close to the\ntarget when the fraction of inliers $\\alpha$ is less than $1/2$. Our main\nresult is an efficient list-decodable learner that recovers a small list of\ncandidates, at least one of which is close to the true solution. On the other\nhand, if $\\alpha = 1-\\epsilon$, where $0< \\epsilon < 1/2$ is sufficiently small\nconstant, the algorithm recovers a single solution. Along the way, we develop a\nrejection-sampling-based algorithm to perform noisy location estimation, which\nmay be of independent interest.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"27 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"First Order Stochastic Optimization with Oblivious Noise\",\"authors\":\"Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos\",\"doi\":\"arxiv-2408.02090\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We initiate the study of stochastic optimization with oblivious noise,\\nbroadly generalizing the standard heavy-tailed noise setup. In our setting, in\\naddition to random observation noise, the stochastic gradient may be subject to\\nindependent oblivious noise, which may not have bounded moments and is not\\nnecessarily centered. Specifically, we assume access to a noisy oracle for the\\nstochastic gradient of $f$ at $x$, which returns a vector $\\\\nabla f(\\\\gamma, x)\\n+ \\\\xi$, where $\\\\gamma$ is the bounded variance observation noise and $\\\\xi$ is\\nthe oblivious noise that is independent of $\\\\gamma$ and $x$. The only\\nassumption we make on the oblivious noise $\\\\xi$ is that $\\\\mathbf{Pr}[\\\\xi = 0]\\n\\\\ge \\\\alpha$ for some $\\\\alpha \\\\in (0, 1)$. In this setting, it is not\\ninformation-theoretically possible to recover a single solution close to the\\ntarget when the fraction of inliers $\\\\alpha$ is less than $1/2$. Our main\\nresult is an efficient list-decodable learner that recovers a small list of\\ncandidates, at least one of which is close to the true solution. On the other\\nhand, if $\\\\alpha = 1-\\\\epsilon$, where $0< \\\\epsilon < 1/2$ is sufficiently small\\nconstant, the algorithm recovers a single solution. 
Along the way, we develop a\\nrejection-sampling-based algorithm to perform noisy location estimation, which\\nmay be of independent interest.\",\"PeriodicalId\":501525,\"journal\":{\"name\":\"arXiv - CS - Data Structures and Algorithms\",\"volume\":\"27 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Data Structures and Algorithms\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.02090\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Data Structures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.02090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
We initiate the study of stochastic optimization with oblivious noise,
broadly generalizing the standard heavy-tailed noise setup. In our setting, in
addition to random observation noise, the stochastic gradient may be subject to
independent oblivious noise, which may not have bounded moments and is not
necessarily centered. Specifically, we assume access to a noisy oracle for the
stochastic gradient of $f$ at $x$, which returns a vector $\nabla f(\gamma, x)
+ \xi$, where $\gamma$ is the bounded-variance observation noise and $\xi$ is
the oblivious noise that is independent of $\gamma$ and $x$. The only
assumption we make on the oblivious noise $\xi$ is that $\mathbf{Pr}[\xi = 0]
\ge \alpha$ for some $\alpha \in (0, 1)$. In this setting, it is not
information-theoretically possible to recover a single solution close to the
target when the fraction of inliers $\alpha$ is less than $1/2$. Our main
result is an efficient list-decodable learner that recovers a small list of
candidates, at least one of which is close to the true solution. On the other
hand, if $\alpha = 1-\epsilon$, where $0 < \epsilon < 1/2$ is a sufficiently small
constant, the algorithm recovers a single solution. Along the way, we develop a
rejection-sampling-based algorithm to perform noisy location estimation, which
may be of independent interest.
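
To make the oracle model concrete, below is a minimal Python sketch of the setting described above. The quadratic objective, the Gaussian choice for the observation noise $\gamma$, the shifted Cauchy choice for the oblivious noise $\xi$, and the value of $\alpha$ are hypothetical choices for illustration only; this is not the paper's list-decodable learner.

import numpy as np

rng = np.random.default_rng(0)

def noisy_gradient_oracle(x, alpha):
    # Toy objective f(x) = 0.5 * ||x||^2, whose true gradient is x. The
    # observation noise gamma is modeled additively here as mean-zero Gaussian
    # noise, one simple instance of bounded variance.
    gamma = rng.normal(scale=0.1, size=x.shape)
    # Oblivious noise xi: zero with probability alpha (so Pr[xi = 0] >= alpha),
    # otherwise heavy-tailed (Cauchy) and shifted, hence without bounded
    # moments and not centered. It is drawn independently of gamma and x.
    if rng.random() < alpha:
        xi = np.zeros_like(x)
    else:
        xi = rng.standard_cauchy(size=x.shape) + 5.0
    return x + gamma + xi

# Plain SGD on this oracle. With alpha = 0.4 < 1/2, a single run need not land
# near the optimum x* = 0, which is why only a short list of candidates, at
# least one of which is close to x*, can be guaranteed.
x = rng.normal(size=3)
for t in range(1, 2001):
    x -= (0.5 / t) * noisy_gradient_oracle(x, alpha=0.4)
print(x)

Independent restarts of such a loop give only a heuristic list of candidates; the paper's contribution is an efficient list-decodable learner with a provable guarantee that at least one returned candidate is close to the true solution.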