{"title":"Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points","authors":"Rishabh Dixit;Mert Gürbüzbalaban;Waheed U Bajwa","doi":"10.1093/imaiai/iaac025","DOIUrl":"https://doi.org/10.1093/imaiai/iaac025","url":null,"abstract":"This paper considers the problem of understanding the exit time of trajectories of gradient-related first-order methods from saddle neighborhoods under certain initial boundary conditions. Given the ‘flat’ geometry around saddle points, first-order methods can struggle to escape these regions quickly owing to the small magnitudes of the gradients encountered. In particular, while it is known that gradient-related first-order methods escape strict-saddle neighborhoods, existing analytic techniques do not explicitly leverage the local geometry around saddle points to control the behavior of gradient trajectories. It is in this context that this paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods using matrix perturbation theory. In doing so, it provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions. In addition, the analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions, which explicitly brings out the dependence on the problem dimension, the conditioning of the saddle neighborhood, and more, for a class of strict-saddle functions.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
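The slow-escape phenomenon this abstract describes is easy to see on a toy strict-saddle function. The sketch below (an illustration under assumed parameters, not the paper's approximate-trajectory construction) runs gradient descent on f(x, y) = (x² − y²)/2 and measures the exit time from a ball around the saddle at the origin:

```python
# Toy illustration of escape from a strict saddle (assumed toy objective,
# not the paper's construction): gradient descent on
# f(x, y) = (x**2 - y**2) / 2, whose gradient is (x, -y). The unstable
# y-component grows by a factor (1 + eta) per step, so the exit time from
# a ball of radius r scales like log(r / |y0|) / log(1 + eta).
def gd_exit_time(x0, y0, eta=0.1, r=1.0, max_iters=10_000):
    x, y = x0, y0
    for k in range(max_iters):
        if x * x + y * y >= r * r:
            return k
        x, y = x - eta * x, y + eta * y  # one gradient-descent step
    return max_iters

# Starting 1000x closer to the saddle adds only ~log(1000)/log(1.1) steps.
t1 = gd_exit_time(0.0, 1e-3)
t2 = gd_exit_time(0.0, 1e-6)
```

On this quadratic the exit time depends only logarithmically on the initial unstable component, which is the kind of explicit initial-condition dependence the paper's perturbation analysis extracts for a general class of strict-saddle functions.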
{"title":"Uncertainty quantification in the Bradley–Terry–Luce model","authors":"Chao Gao;Yandi Shen;Anderson Y Zhang","doi":"10.1093/imaiai/iaac032","DOIUrl":"https://doi.org/10.1093/imaiai/iaac032","url":null,"abstract":"The Bradley–Terry–Luce (BTL) model is a benchmark model for pairwise comparisons between individuals. Despite recent progress on the first-order asymptotics of several popular procedures, the understanding of uncertainty quantification in the BTL model remains largely incomplete, especially when the underlying comparison graph is sparse. In this paper, we fill this gap by focusing on two estimators that have received much recent attention: the maximum likelihood estimator (MLE) and the spectral estimator. Using a unified proof strategy, we derive sharp and uniform non-asymptotic expansions for both estimators in the sparsest possible regime (up to some poly-logarithmic factors) of the underlying comparison graph. These expansions allow us to obtain: (i) finite-dimensional central limit theorems for both estimators; (ii) construction of confidence intervals for individual ranks; (iii) optimal constant of <tex>$\ell_2$</tex> estimation, which is achieved by the MLE but not by the spectral estimator. Our proof is based on a self-consistent equation of the second-order remainder vector and a novel leave-two-out analysis.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal orthogonal group synchronization and rotation group synchronization","authors":"Chao Gao;Anderson Y Zhang","doi":"10.1093/imaiai/iaac022","DOIUrl":"https://doi.org/10.1093/imaiai/iaac022","url":null,"abstract":"We study the statistical estimation problem of orthogonal group synchronization and rotation group synchronization. The model is <tex>$Y_{ij} = Z_i^* Z_j^{*T} + \sigma W_{ij}\in\mathbb{R}^{d\times d}$</tex> where <tex>$W_{ij}$</tex> is a Gaussian random matrix and <tex>$Z_i^*$</tex> is either an orthogonal matrix or a rotation matrix, and each <tex>$Y_{ij}$</tex> is observed independently with probability <tex>$p$</tex>. We analyze an iterative polar decomposition algorithm for the estimation of <tex>$Z^*$</tex> and show it has an error of <tex>$(1+o(1))\frac{\sigma^2 d(d-1)}{2np}$</tex> when initialized by spectral methods. A matching minimax lower bound is further established that leads to the optimality of the proposed algorithm as it achieves the exact minimax risk.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
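The polar-decomposition step at the heart of the abstract above maps a matrix to its nearest orthogonal matrix. A minimal sketch (the pooling rule and initialization in `sync_update` are illustrative assumptions, not the authors' exact iteration):

```python
import numpy as np

# The polar factor of an invertible M is the orthogonal matrix closest
# to M in Frobenius norm; from the SVD M = U S V^T it equals U V^T.
def polar_factor(M):
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

def sync_update(Y, Z):
    # One synchronization-style update: re-estimate each Z_i as the polar
    # factor of the sum of observed blocks Y_ij applied to the current
    # estimates Z_j (illustrative pooling over all j != i).
    n = len(Z)
    return [polar_factor(sum(Y[i][j] @ Z[j] for j in range(n) if j != i))
            for i in range(n)]
```

By construction every update returns orthogonal matrices; iterating such updates from a spectral initialization is the kind of procedure analyzed above.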
{"title":"Fast splitting algorithms for sparsity-constrained and noisy group testing","authors":"Eric Price;Jonathan Scarlett;Nelvin Tan","doi":"10.1093/imaiai/iaac031","DOIUrl":"https://doi.org/10.1093/imaiai/iaac031","url":null,"abstract":"In group testing, the goal is to identify a subset of defective items within a larger set of items based on tests whose outcomes indicate whether at least one defective item is present. This problem is relevant in areas such as medical testing, DNA sequencing, communication protocols, and more. In this paper, we study (i) a sparsity-constrained version of the problem, in which the testing procedure is subject to one of the following two constraints: items are finitely divisible and thus may participate in at most <tex>$\gamma$</tex> tests; or tests are size-constrained to pool no more than <tex>$\rho$</tex> items per test; and (ii) a noisy version of the problem, where each test outcome is independently flipped with some constant probability. Under each of these settings, considering the for-each recovery guarantee with asymptotically vanishing error probability, we introduce a fast splitting algorithm and establish its near-optimality not only in terms of the number of tests, but also in terms of the decoding time. While the most basic formulations of our algorithms require <tex>$\varOmega(n)$</tex> storage, we also provide low-storage variants based on hashing, with similar recovery guarantees.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
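In the noiseless, unconstrained case, the test-and-split idea behind such algorithms can be sketched in a few lines (a simplification for illustration; the paper's algorithms additionally handle noise and the divisibility/pool-size constraints): a pool is tested, and positive pools are halved recursively until single defective items are isolated.

```python
# Noiseless test-and-split sketch: a pooled test is positive iff the
# pool contains at least one defective item; positive pools are split
# in half and recursed. num_tests is a one-element list used as a
# mutable counter of pooled tests performed.
def split_and_test(items, defective, num_tests):
    num_tests[0] += 1                  # one pooled test on this group
    if not (set(items) & defective):
        return []                      # negative outcome: prune the pool
    if len(items) == 1:
        return list(items)             # isolated a defective item
    mid = len(items) // 2
    return (split_and_test(items[:mid], defective, num_tests)
            + split_and_test(items[mid:], defective, num_tests))
```

With a single defective among n items this uses on the order of 2 log2(n) tests rather than n individual tests, which is the tests-versus-decoding-time trade-off the splitting approach exploits.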
{"title":"On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means","authors":"Jules Depersin;Guillaume Lecué","doi":"10.1093/imaiai/iaac026","DOIUrl":"https://doi.org/10.1093/imaiai/iaac026","url":null,"abstract":"We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the <tex>$L_2$</tex> case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median, complementing the <tex>$\sqrt{n}$</tex>-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of the MAD can be used to construct an estimator of the covariance matrix assuming only the existence of a second moment or, when a second moment does not exist, an estimator of a scatter matrix.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50298050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
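The median-of-means construction underlying these estimators is simple to state. A minimal one-dimensional sketch (illustrative only; the paper studies multivariate MOM versions of SDO and MAD): split the sample into blocks, average each block, and return the median of the block means, so that outliers corrupt only the blocks containing them.

```python
import statistics

# Minimal one-dimensional median-of-means (MOM) mean estimator: split
# the sample into num_blocks consecutive blocks, average each block,
# and take the median of the block means.
def median_of_means(xs, num_blocks):
    size = len(xs) // num_blocks
    block_means = [sum(xs[b * size:(b + 1) * size]) / size
                   for b in range(num_blocks)]
    return statistics.median(block_means)

# One gross outlier ruins the empirical mean but not the MOM estimate.
data = [1.0] * 9 + [1000.0]
```

Here `median_of_means(data, 5)` returns 1.0 while the empirical mean is 100.9: the outlier corrupts a single block mean, which the median then discards.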
{"title":"Sparse recovery by reduced variance stochastic approximation","authors":"Anatoli Juditsky;Andrei Kulunchakov;Hlib Tsyntseus","doi":"10.1093/imaiai/iaac028","DOIUrl":"https://doi.org/10.1093/imaiai/iaac028","url":null,"abstract":"In this paper, we discuss the application of iterative Stochastic Optimization routines to the problem of sparse signal recovery from noisy observation. Using the Stochastic Mirror Descent algorithm as a building block, we develop a multistage procedure for recovery of sparse solutions to Stochastic Optimization problems under assumptions of smoothness and quadratic minoration on the expected objective. An interesting feature of the proposed algorithm is linear convergence of the approximate solution during the preliminary phase of the routine, when the component of stochastic error in the gradient observation, which is due to a bad initial approximation of the optimal solution, is larger than the ‘ideal’ asymptotic error component owing to observation noise ‘at the optimal solution’. We also show how one can straightforwardly enhance reliability of the corresponding solution using Median-of-Means-like techniques. We illustrate the performance of the proposed algorithms in application to classical problems of recovery of sparse and low-rank signals in the generalized linear regression framework. We show, under rather weak assumptions on the regressor and noise distributions, how they lead to parameter estimates which obey (up to factors which are logarithmic in problem dimension and confidence level) the best known accuracy bounds.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50298051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The geometry of adversarial training in binary classification","authors":"Leon Bungert;Nicolás García Trillos;Ryan Murray","doi":"10.1093/imaiai/iaac029","DOIUrl":"https://doi.org/10.1093/imaiai/iaac029","url":null,"abstract":"We establish an equivalence between a family of adversarial training problems for non-parametric binary classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. The resulting regularized risk minimization problems admit exact convex relaxations of the type <tex>$L^1+\text{(nonlocal)}\operatorname{TV}$</tex>, a form frequently studied in image analysis and graph-based learning. A rich geometric structure is revealed by this reformulation which in turn allows us to establish a series of properties of optimal solutions of the original problem, including the existence of minimal and maximal solutions (interpreted in a suitable sense) and the existence of regular solutions (also interpreted in a suitable sense). In addition, we highlight how the connection between adversarial training and perimeter minimization problems provides a novel, directly interpretable, statistical motivation for a family of regularized risk minimization problems involving perimeter/total variation. The majority of our theoretical results are independent of the distance used to define adversarial attacks.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50298053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nearly minimax-optimal rates for noisy sparse phase retrieval via early-stopped mirror descent","authors":"Fan Wu;Patrick Rebeschini","doi":"10.1093/imaiai/iaac024","DOIUrl":"https://doi.org/10.1093/imaiai/iaac024","url":null,"abstract":"This paper studies early-stopped mirror descent applied to noisy sparse phase retrieval, which is the problem of recovering a <tex>$k$</tex>-sparse signal <tex>$\textbf{x}^\star\in\mathbb{R}^n$</tex> from a set of quadratic Gaussian measurements corrupted by sub-exponential noise. We consider the (non-convex) unregularized empirical risk minimization problem and show that early-stopped mirror descent, when equipped with the hypentropy mirror map and proper initialization, achieves a nearly minimax-optimal rate of convergence, provided the sample size is at least of order <tex>$k^2$</tex> (modulo a logarithmic term) and the minimum (in modulus) non-zero entry of the signal is on the order of <tex>$\|\textbf{x}^\star\|_2/\sqrt{k}$</tex>. Our theory leads to a simple algorithm that does not rely on explicit regularization or thresholding steps to promote sparsity. More generally, our results establish a connection between mirror descent and sparsity in the non-convex problem of noisy sparse phase retrieval, adding to the literature on early stopping that has mostly focused on non-sparse, Euclidean and convex settings via gradient descent. Our proof combines a potential-based analysis of mirror descent with a quantitative control on a variational coherence property that we establish along the path of mirror descent, up to a prescribed stopping time.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8016800/10058586/10058608.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
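A hypentropy mirror-descent step can be written in closed form, since the gradient of the hypentropy mirror map is coordinate-wise asinh(x_i/β). The sketch below (step size, β, and the least-squares toy objective are assumptions for illustration; the paper's setting is quadratic phase-retrieval measurements) shows the characteristic implicit-sparsity behavior: coordinates started at zero, with vanishing gradient there, stay exactly at zero with no explicit regularizer or thresholding.

```python
import math

# Mirror descent with the hypentropy mirror map: the map's gradient is
# asinh(x_i / beta), so the update is applied in dual coordinates
# theta_i = asinh(x_i / beta) and mapped back via x_i = beta*sinh(theta_i).
def hypentropy_md(grad, x0, beta=1e-8, eta=0.2, iters=500):
    theta = [math.asinh(v / beta) for v in x0]
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        theta = [t - eta * gi for t, gi in zip(theta, g)]
        x = [beta * math.sinh(t) for t in theta]
    return x

# Toy objective 0.5 * ||x - target||^2, started from the origin: the
# second coordinate of the target is zero, and its iterate never moves.
target = [1.0, 0.0]
x = hypentropy_md(lambda v: [vi - ti for vi, ti in zip(v, target)],
                  [0.0, 0.0])
```

A small β makes the dynamics strongly multiplicative near zero, which is the mechanism connecting mirror descent to sparsity referred to above; early stopping then controls how far the nonzero coordinates are grown.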
{"title":"Perturbation bounds for (nearly) orthogonally decomposable tensors with statistical applications","authors":"Arnab Auddy;Ming Yuan","doi":"10.1093/imaiai/iaac033","DOIUrl":"https://doi.org/10.1093/imaiai/iaac033","url":null,"abstract":"We develop deterministic perturbation bounds for singular values and vectors of orthogonally decomposable tensors, in a spirit similar to classical results for matrices such as those due to Weyl, Davis, Kahan and Wedin. Our bounds demonstrate intriguing differences between matrices and higher order tensors. Most notably, they indicate that for higher order tensors perturbation affects each essential singular value/vector in isolation, and its effect on an essential singular vector does not depend on the multiplicity of its corresponding singular value or its distance from other singular values. Our results can be readily applied and provide a unified treatment to many different problems involving higher order orthogonally decomposable tensors. In particular, we illustrate the implications of our bounds through connected yet seemingly different high-dimensional data analysis tasks: the unsupervised learning scenario of tensor SVD and the supervised task of tensor regression, leading to new insights in both of these settings.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50297917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning","authors":"Gen Li;Laixi Shi;Yuxin Chen;Yuejie Chi","doi":"10.1093/imaiai/iaac034","DOIUrl":"https://doi.org/10.1093/imaiai/iaac034","url":null,"abstract":"Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic Markov decision process with <tex>$S$</tex> states, <tex>$A$</tex> actions and horizon length <tex>$H$</tex>, substantial progress has been achieved toward characterizing the minimax-optimal regret, which scales on the order of <tex>$\sqrt{H^2SAT}$</tex> (modulo log factors) with <tex>$T$</tex> the total number of samples. While several competing solution paradigms have been proposed to minimize regret, they are either memory-inefficient, or fall short of optimality unless the sample size exceeds an enormous threshold (e.g. <tex>$S^6A^4\,\mathrm{poly}(H)$</tex> for existing model-free methods). To overcome such a large sample size barrier to efficient RL, we design a novel model-free algorithm, with space complexity <tex>$O(SAH)$</tex>, that achieves near-optimal regret as soon as the sample size exceeds the order of <tex>$SA\,\mathrm{poly}(H)$</tex>. In terms of this sample size requirement (also referred to as the initial burn-in cost), our method improves upon any prior memory-efficient algorithm that is asymptotically regret-optimal by at least a factor of <tex>$S^5A^3$</tex>. Leveraging the recently introduced variance reduction strategy (also called reference-advantage decomposition), the proposed algorithm employs an early-settled reference update rule, with the aid of two Q-learning sequences with upper and lower confidence bounds. The design principle of our early-settled variance reduction method might be of independent interest to other RL settings that involve intricate exploration–exploitation trade-offs.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8016800/10058586/10058618.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50298054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
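The optimism-plus-step-size mechanics behind Q-learning with confidence bonuses can be illustrated on a deterministic toy (this is a generic sketch, not the paper's early-settled reference-advantage algorithm; with a single state and horizon 1 the update degenerates to a bandit):

```python
import math

# Generic optimistic Q-learning sketch with a UCB-style bonus on a
# deterministic 1-state, 2-action, horizon-1 toy problem: action 0 pays
# reward 0.0 and action 1 pays 1.0. Optimistic initialization plus a
# shrinking bonus makes the greedy rule explore both actions and then
# settle on the better one.
def ucb_q_learning(episodes=200, c=1.0):
    Q = [1.0, 1.0]           # optimistic initialization at the max reward
    N = [0, 0]               # visit counts
    rewards = [0.0, 1.0]
    for _ in range(episodes):
        a = max((0, 1), key=lambda i: Q[i])  # greedy w.r.t. optimistic Q
        N[a] += 1
        alpha = 1.0 / N[a]                   # 1/n learning rate
        bonus = c * math.sqrt(1.0 / N[a])    # exploration bonus
        Q[a] = (1 - alpha) * Q[a] + alpha * (rewards[a] + bonus)
        Q[a] = min(Q[a], 1.0 + c)            # keep optimism bounded
    return Q
```

The bad action is tried, its optimistic value decays below the true value of the good action, and the greedy rule switches permanently; the algorithm described above refines this template with reference-advantage variance reduction and paired upper/lower confidence sequences.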