Title: Smoothed Analysis with Adaptive Adversaries
Authors: Nika Haghtalab, Tim Roughgarden, Abhishek Shetty
Journal: Journal of the ACM (volume 72 1)
Publication type: Journal Article
Publication date: 2024-04-13
DOI: 10.1145/3656638
URL: https://doi.org/10.1145/3656638
Citation count: 0
Abstract
We prove novel algorithmic guarantees for several online problems in the smoothed analysis model. In this model, at each time step an adversary chooses an input distribution whose density function is bounded above pointwise by \(\tfrac{1}{\sigma}\) times that of the uniform distribution; nature then samples an input from this distribution. Here, σ is a parameter that interpolates between the extremes of worst-case and average-case analysis. Crucially, our results hold for adaptive adversaries that can base their choice of an input distribution on the decisions of the algorithm and the realizations of the inputs in previous time steps. An adaptive adversary can nontrivially correlate inputs at different time steps with each other and with the algorithm’s current state; this appears to rule out the standard proof approaches in smoothed analysis.
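The model above can be illustrated with a minimal sketch, assuming the domain is the interval [0, 1] so that the uniform density is 1. The names `sample_sigma_smooth`, `adaptive_adversary`, and `run` are hypothetical, and the adversary's rule (re-centering on the last realized input) is just one arbitrary adaptive strategy, not one taken from the paper. For simplicity it reacts only to past inputs, whereas the adversary described above may also react to the algorithm's decisions.

```python
import random


def sample_sigma_smooth(center: float, sigma: float, rng: random.Random) -> float:
    """Sample uniformly from a length-sigma interval inside [0, 1].

    The density is 1/sigma on that interval and 0 elsewhere, so it is bounded
    pointwise by 1/sigma times the uniform density on [0, 1].
    """
    # Clamp the left endpoint so the whole interval stays inside [0, 1].
    lo = min(max(center - sigma / 2, 0.0), 1.0 - sigma)
    return lo + sigma * rng.random()


def adaptive_adversary(history: list[float]) -> float:
    """Hypothetical adaptive rule: center the next interval on the last input."""
    return history[-1] if history else 0.5


def run(T: int, sigma: float, seed: int = 0) -> list[float]:
    """Simulate T rounds: the adversary picks a distribution, nature samples."""
    rng = random.Random(seed)
    history: list[float] = []
    for _ in range(T):
        center = adaptive_adversary(history)
        history.append(sample_sigma_smooth(center, sigma, rng))
    return history
```

With `sigma = 1` the only admissible interval is all of [0, 1], which corresponds to the average-case extreme; as `sigma` shrinks toward 0 the adversary can place each input almost exactly, approaching the worst case.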
This paper presents a general technique for proving smoothed algorithmic guarantees against adaptive adversaries, in effect reducing the setting of an adaptive adversary to the much simpler case of an oblivious adversary (i.e., an adversary that commits in advance to the entire sequence of input distributions). We apply this technique to prove strong smoothed guarantees for three different problems:
(1)
Online learning: We consider the online prediction problem, where instances are generated from an adaptive sequence of σ-smooth distributions and the hypothesis class has VC dimension d. We bound the regret by \(\tilde{O}\big (\sqrt {T d\ln (1/\sigma)} + d\ln (T/\sigma) \big) \) and provide a near-matching lower bound. Our result shows that under smoothed analysis, learnability against adaptive adversaries is characterized by the finiteness of the VC dimension. By contrast, in worst-case analysis online learnability is characterized by the Littlestone dimension (which is infinite even in the extremely restricted case of one-dimensional threshold functions). Our results fully answer an open question of Rakhlin et al. [64].
(2)
Online discrepancy minimization: We consider the setting of the online Komlós problem, where the input is generated from an adaptive sequence of σ-smooth and isotropic distributions on the \(\ell_2\) unit ball. We bound the \(\ell_\infty\) norm of the discrepancy vector by \(\tilde{O}\big (\ln ^2\big (\frac{nT}{\sigma }\big) \big) \). This is as opposed to the worst-case analysis, where the tight discrepancy bound is \(\Theta (\sqrt {T/n}) \). We show such polylog(nT/σ) discrepancy guarantees are not achievable for non-isotropic σ-smooth distributions.
(3)
Dispersion in online optimization: We consider online optimization with piecewise Lipschitz functions, where functions with ℓ discontinuities are chosen by a smoothed adaptive adversary, and show that the resulting sequence is \(\big ({\sigma }/{\sqrt {T\ell }}, \tilde{O}\big (\sqrt {T\ell } \big)\big) \)-dispersed. That is, every ball of radius \({\sigma }/{\sqrt {T\ell }} \) is split by \(\tilde{O}\big (\sqrt {T\ell } \big) \) of the partitions made by these functions. This result matches the dispersion parameters of Balcan et al. [13] for oblivious smooth adversaries, up to logarithmic factors. On the other hand, worst-case sequences are trivially (0, T)-dispersed.
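The notion of dispersion in item (3) can be made concrete with a small sketch, assuming a one-dimensional domain in which each function is represented only by the list of its discontinuity points, and reading "a ball is split by a function" as "the function has a discontinuity inside that ball". The function name `max_functions_split` is hypothetical. Under these assumptions, a sequence is (w, k)-dispersed when the returned value for radius w is at most k.

```python
def max_functions_split(discontinuities: list[list[float]], w: float) -> int:
    """Largest number of functions with a discontinuity in one ball of radius w.

    discontinuities[i] holds the discontinuity points of the i-th function.
    A ball of radius w in one dimension is a closed interval of length 2 * w.
    """
    # Flatten to (point, function index) pairs, sorted by position.
    points = sorted((x, i) for i, xs in enumerate(discontinuities) for x in xs)
    best = 0
    # Any ball can be slid right until its left end touches a point without
    # losing the points it contains, so checking these windows suffices.
    for start, (left, _) in enumerate(points):
        seen = set()
        for x, i in points[start:]:
            if x > left + 2 * w:
                break
            seen.add(i)
        best = max(best, len(seen))
    return best
```

For example, if all T functions share one discontinuity point, `max_functions_split` returns T even at radius 0, which is the trivial (0, T) case mentioned above; points spread far apart relative to w give a much smaller count.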
About the journal:
The best indicator of the scope of the journal is provided by the areas covered by its Editorial Board. These areas change from time to time, as the field evolves. The following areas are currently covered by a member of the Editorial Board: Algorithms and Combinatorial Optimization; Algorithms and Data Structures; Algorithms, Combinatorial Optimization, and Games; Artificial Intelligence; Complexity Theory; Computational Biology; Computational Geometry; Computer Graphics and Computer Vision; Computer-Aided Verification; Cryptography and Security; Cyber-Physical, Embedded, and Real-Time Systems; Database Systems and Theory; Distributed Computing; Economics and Computation; Information Theory; Logic and Computation; Logic, Algorithms, and Complexity; Machine Learning and Computational Learning Theory; Networking; Parallel Computing and Architecture; Programming Languages; Quantum Computing; Randomized Algorithms and Probabilistic Analysis of Algorithms; Scientific Computing and High Performance Computing; Software Engineering; Web Algorithms and Data Mining