{"title":"Sample complexity analysis for adaptive optimization algorithms with stochastic oracles","authors":"Billy Jin, Katya Scheinberg, Miaolan Xie","doi":"10.1007/s10107-024-02078-z","DOIUrl":"https://doi.org/10.1007/s10107-024-02078-z","url":null,"abstract":"<p>Several classical adaptive optimization algorithms, such as line search and trust-region methods, have been recently extended to stochastic settings where function values, gradients, and Hessians in some cases, are estimated via stochastic oracles. Unlike the majority of stochastic methods, these methods do not use a pre-specified sequence of step size parameters, but adapt the step size parameter according to the estimated progress of the algorithm and use it to dictate the accuracy required from the stochastic oracles. The requirements on the stochastic oracles are, thus, also adaptive and the oracle costs can vary from iteration to iteration. The step size parameters in these methods can increase and decrease based on the perceived progress, but unlike the deterministic case they are not bounded away from zero due to possible oracle failures, and bounds on the step size parameter have not been previously derived. This creates obstacles in the total complexity analysis of such methods, because the oracle costs are typically decreasing in the step size parameter, and could be arbitrarily large as the step size parameter goes to 0. Thus, until now only the total iteration complexity of these methods has been analyzed. In this paper, we derive a lower bound on the step size parameter that holds with high probability for a large class of adaptive stochastic methods. We then use this lower bound to derive a framework for analyzing the expected and high probability total oracle complexity of any method in this class. Finally, we apply this framework to analyze the total sample complexity of two particular algorithms, STORM (Blanchet et al. in INFORMS J Optim 1(2):92–119, 2019) and SASS (Jin et al. in High probability complexity bounds for adaptive step search based on stochastic oracles, 2021. https://doi.org/10.48550/ARXIV.2106.06454), in the expected risk minimization problem.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140886228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph coloring and semidefinite rank","authors":"Renee Mirka, Devin Smedira, David P. Williamson","doi":"10.1007/s10107-024-02085-0","DOIUrl":"https://doi.org/10.1007/s10107-024-02085-0","url":null,"abstract":"","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140660160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Dadush, Friedrich Eisenbrand, Thomas Rothvoss
{"title":"From approximate to exact integer programming","authors":"Daniel Dadush, Friedrich Eisenbrand, Thomas Rothvoss","doi":"10.1007/s10107-024-02084-1","DOIUrl":"https://doi.org/10.1007/s10107-024-02084-1","url":null,"abstract":"<p>Approximate integer programming is the following: For a given convex body <span>(K subseteq {mathbb {R}}^n)</span>, either determine whether <span>(K cap {mathbb {Z}}^n)</span> is empty, or find an integer point in the convex body <span>(2cdot (K - c) +c)</span> which is <i>K</i>, scaled by 2 from its center of gravity <i>c</i>. Approximate integer programming can be solved in time <span>(2^{O(n)})</span> while the fastest known methods for exact integer programming run in time <span>(2^{O(n)} cdot n^n)</span>. So far, there are no efficient methods for integer programming known that are based on approximate integer programming. Our main contribution are two such methods, each yielding novel complexity results. First, we show that an integer point <span>(x^* in (K cap {mathbb {Z}}^n))</span> can be found in time <span>(2^{O(n)})</span>, provided that the <i>remainders</i> of each component <span>(x_i^* mod ell )</span> for some arbitrarily fixed <span>(ell ge 5(n+1))</span> of <span>(x^*)</span> are given. The algorithm is based on a <i>cutting-plane technique</i>, iteratively halving the volume of the feasible set. The cutting planes are determined via approximate integer programming. Enumeration of the possible remainders gives a <span>(2^{O(n)}n^n)</span> algorithm for general integer programming. This matches the current best bound of an algorithm by Dadush (Integer programming, lattice algorithms, and deterministic, vol. Estimation. Georgia Institute of Technology, Atlanta, 2012) that is considerably more involved. Our algorithm also relies on a new <i>asymmetric approximate Carathéodory theorem</i> that might be of interest on its own. Our second method concerns integer programming problems in equation-standard form <span>(Ax = b, 0 le x le u, , x in {mathbb {Z}}^n)</span>. Such a problem can be reduced to the solution of <span>(prod _i O(log u_i +1))</span> approximate integer programming problems. This implies, for example that <i>knapsack</i> or <i>subset-sum</i> problems with <i>polynomial variable range</i> <span>(0 le x_i le p(n))</span> can be solved in time <span>((log n)^{O(n)})</span>. For these problems, the best running time so far was <span>(n^n cdot 2^{O(n)})</span>.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140885825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A normal fan projection algorithm for low-rank optimization","authors":"James R. Scott, Joseph Geunes","doi":"10.1007/s10107-024-02079-y","DOIUrl":"https://doi.org/10.1007/s10107-024-02079-y","url":null,"abstract":"","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140667711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Satoru Fujishige, Tomonari Kitahara, László A. Végh
{"title":"An update-and-stabilize framework for the minimum-norm-point problem","authors":"Satoru Fujishige, Tomonari Kitahara, László A. Végh","doi":"10.1007/s10107-024-02077-0","DOIUrl":"https://doi.org/10.1007/s10107-024-02077-0","url":null,"abstract":"<p>We consider the minimum-norm-point (MNP) problem over polyhedra, a well-studied problem that encompasses linear programming. We present a general algorithmic framework that combines two fundamental approaches for this problem: active set methods and first order methods. Our algorithm performs first order update steps, followed by iterations that aim to ‘stabilize’ the current iterate with additional projections, i.e., find a locally optimal solution whilst keeping the current tight inequalities. Such steps have been previously used in active set methods for the nonnegative least squares (NNLS) problem. We bound on the number of iterations polynomially in the dimension and in the associated circuit imbalance measure. In particular, the algorithm is strongly polynomial for network flow instances. Classical NNLS algorithms such as the Lawson–Hanson algorithm are special instantiations of our framework; as a consequence, we obtain convergence bounds for these algorithms. Our preliminary computational experiments show promising practical performance.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140885829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extended convergence analysis of the Scholtes-type regularization for cardinality-constrained optimization problems","authors":"Sebastian Lämmel, Vladimir Shikhman","doi":"10.1007/s10107-024-02082-3","DOIUrl":"https://doi.org/10.1007/s10107-024-02082-3","url":null,"abstract":"<p>We extend the convergence analysis of the Scholtes-type regularization method for cardinality-constrained optimization problems. Its behavior is clarified in the vicinity of saddle points, and not just of minimizers as it has been done in the literature before. This becomes possible by using as an intermediate step the recently introduced regularized continuous reformulation of a cardinality-constrained optimization problem. We show that the Scholtes-type regularization method is well-defined locally around a nondegenerate T-stationary point of this regularized continuous reformulation. Moreover, the nondegenerate Karush–Kuhn–Tucker points of the corresponding Scholtes-type regularization converge to a T-stationary point having the same index, i.e. its topological type persists. As consequence, we conclude that the global structure of the Scholtes-type regularization essentially coincides with that of CCOP.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140885824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compressing branch-and-bound trees","authors":"Gonzalo Muñoz, Joseph Paat, Álinson S. Xavier","doi":"10.1007/s10107-024-02080-5","DOIUrl":"https://doi.org/10.1007/s10107-024-02080-5","url":null,"abstract":"<p>A branch-and-bound (BB) tree certifies a dual bound on the value of an integer program. In this work, we introduce the tree compression problem (TCP): <i>Given a BB tree</i> <i>T</i> <i>that certifies a dual bound, can we obtain a smaller tree with the same (or stronger) bound by either (1) applying a different disjunction at some node in</i> <i>T</i> <i>or (2) removing leaves from</i> <i>T</i>? We believe such post-hoc analysis of BB trees may assist in identifying helpful general disjunctions in BB algorithms. We initiate our study by considering computational complexity and limitations of TCP. We then conduct experiments to evaluate the compressibility of realistic branch-and-bound trees generated by commonly-used branching strategies, using both an exact and a heuristic compression algorithm.</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140595792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alessandro Rudi, Ulysse Marteau-Ferey, Francis Bach
{"title":"Finding global minima via kernel approximations","authors":"Alessandro Rudi, Ulysse Marteau-Ferey, Francis Bach","doi":"10.1007/s10107-024-02081-4","DOIUrl":"https://doi.org/10.1007/s10107-024-02081-4","url":null,"abstract":"<p>We consider the global minimization of smooth functions based solely on function evaluations. Algorithms that achieve the optimal number of function evaluations for a given precision level typically rely on explicitly constructing an approximation of the function which is then minimized with algorithms that have exponential running-time complexity. In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum. This is done by using infinite sums of square smooth functions and has strong links with polynomial sum-of-squares hierarchies. Leveraging recent representation properties of reproducing kernel Hilbert spaces, the infinite-dimensional optimization problem can be solved by subsampling in time polynomial in the number of function evaluations, and with theoretical guarantees on the obtained minimum. Given <i>n</i> samples, the computational cost is <span>(O(n^{3.5}))</span> in time, <span>(O(n^2))</span> in space, and we achieve a convergence rate to the global optimum that is <span>(O(n^{-m/d + 1/2 + 3/d}))</span> where <i>m</i> is the degree of differentiability of the function and <i>d</i> the number of dimensions. The rate is nearly optimal in the case of Sobolev functions and more generally makes the proposed method particularly suitable for functions with many derivatives. Indeed, when <i>m</i> is in the order of <i>d</i>, the convergence rate to the global optimum does not suffer from the curse of dimensionality, which affects only the worst-case constants (that we track explicitly through the paper).\u0000</p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140595779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stackelberg risk preference design","authors":"","doi":"10.1007/s10107-024-02083-2","DOIUrl":"https://doi.org/10.1007/s10107-024-02083-2","url":null,"abstract":"<h3>Abstract</h3> <p>Risk measures are commonly used to capture the risk preferences of decision-makers (DMs). The decisions of DMs can be nudged or manipulated when their risk preferences are influenced by factors such as the availability of information about the uncertainties. This work proposes a Stackelberg risk preference design (STRIPE) problem to capture a designer’s incentive to influence DMs’ risk preferences. STRIPE consists of two levels. In the lower level, individual DMs in a population, known as the followers, respond to uncertainties according to their risk preference types. In the upper level, the leader influences the distribution of the types to induce targeted decisions and steers the follower’s preferences to it. Our analysis centers around the solution concept of approximate Stackelberg equilibrium that yields suboptimal behaviors of the players. We show the existence of the approximate Stackelberg equilibrium. The primitive risk perception gap, defined as the Wasserstein distance between the original and the target type distributions, is important in estimating the optimal design cost. We connect the leader’s optimality compromise on the cost with her ambiguity tolerance on the follower’s approximate solutions leveraging Lipschitzian properties of the lower level solution mapping. To obtain the Stackelberg equilibrium, we reformulate STRIPE into a single-level optimization problem using the spectral representations of law-invariant coherent risk measures. We create a data-driven approach for computation and study its performance guarantees. We apply STRIPE to contract design problems under approximate incentive compatibility. Moreover, we connect STRIPE with meta-learning problems and derive adaptation performance estimates of the meta-parameters. </p>","PeriodicalId":18297,"journal":{"name":"Mathematical Programming","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140596179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}