{"title":"Improved global performance guarantees of second-order methods in convex minimization","authors":"Pavel Dvurechensky, Yurii Nesterov","doi":"10.1007/s10208-025-09726-6","DOIUrl":"https://doi.org/10.1007/s10208-025-09726-6","url":null,"abstract":"<p>In this paper, we attempt to compare two distinct branches of research on second-order optimization methods. The first one studies self-concordant functions and barriers, the main assumption being that the third derivative of the objective is bounded by the second derivative. The second branch studies cubic regularized Newton methods (CRNMs) with the main assumption that the second derivative is Lipschitz continuous. We develop a new theoretical analysis for a path-following scheme (PFS) for general self-concordant functions, as opposed to the classical path-following scheme developed for self-concordant barriers. We show that the complexity bound for this scheme is better than that of the Damped Newton Method (DNM) and show that our method has global superlinear convergence. We propose also a new predictor-corrector path-following scheme (PCPFS) that leads to further improvement of constant factors in the complexity guarantees for minimizing general self-concordant functions. We also apply path-following schemes to different classes of constrained optimization problems and obtain the resulting complexity bounds. Finally, we analyze an important subclass of general self-concordant functions, namely a class of strongly convex functions with Lipschitz continuous second derivative, and show that for this subclass CRNMs give even better complexity bounds.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"14 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithms for Mean-Field Variational Inference Via Polyhedral Optimization in the Wasserstein Space","authors":"Yiheng Jiang, Sinho Chewi, Aram-Alexandre Pooladian","doi":"10.1007/s10208-025-09721-x","DOIUrl":"https://doi.org/10.1007/s10208-025-09721-x","url":null,"abstract":"<p>We develop a theory of finite-dimensional polyhedral subsets over the Wasserstein space and optimization of functionals over them via first-order methods. Our main application is to the problem of mean-field variational inference (MFVI), which seeks to approximate a distribution <span><script type=\"math/tex\">\\pi</script></span> over <span><script type=\"math/tex\">\\mathbb{R}^d</script></span> by a product measure <span><script type=\"math/tex\">\\pi^\\star</script></span>. When <span><script type=\"math/tex\">\\pi</script></span> is strongly log-concave and log-smooth, we provide (1) approximation rates certifying that <span><script type=\"math/tex\">\\pi^\\star</script></span> is close to the minimizer <span><script type=\"math/tex\">\\pi^\\star_\\diamond</script></span> of the KL divergence over a <i>polyhedral</i> set <span><script type=\"math/tex\">\\mathcal{P}_\\diamond</script></span>, and (2) an algorithm for minimizing <span><script type=\"math/tex\">\\mathop{\\textrm{KL}}\\limits(\\cdot \\!\\;\\Vert \\;\\!\\pi)</script></span> over <span><script type=\"math/tex\">\\mathcal{P}_\\diamond</script></span> based on accelerated gradient descent over <span><script type=\"math/tex\">\\mathbb{R}^d</script></span>. As a byproduct of our analysis, we obtain the first end-to-end analysis for gradient-based algorithms for MFVI.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"13 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time Splitting and Error Estimates for Nonlinear Schrödinger Equations with a Potential","authors":"Rémi Carles","doi":"10.1007/s10208-025-09727-5","DOIUrl":"https://doi.org/10.1007/s10208-025-09727-5","url":null,"abstract":"<p>We consider the nonlinear Schrödinger equation with a potential, also known as Gross-Pitaevskii equation. By introducing a suitable spectral localization, we prove low regularity error estimates for the time discretization corresponding to an adapted Lie-Trotter splitting scheme. The proof is based on tools from spectral theory and pseudodifferential calculus in order to obtain various estimates on the spectral localization, including discrete Strichartz estimates which support the nonlinear analysis.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"38 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fisher–Rao Gradient Flow for Entropy-Regularised Markov Decision Processes in Polish Spaces","authors":"Bekzhan Kerimkulov, James-Michael Leahy, David Siska, Lukasz Szpruch, Yufei Zhang","doi":"10.1007/s10208-025-09729-3","DOIUrl":"https://doi.org/10.1007/s10208-025-09729-3","url":null,"abstract":"<p>We study the global convergence of a Fisher–Rao policy gradient flow for infinite-horizon entropy-regularised Markov decision processes with Polish state and action spaces. The flow is a continuous-time analogue of a policy mirror descent method. We establish the global well-posedness of the gradient flow and demonstrate its exponential convergence to the optimal policy. Moreover, we prove the flow is stable with respect to gradient evaluation, offering insights into the performance of a natural policy gradient flow with log-linear policy parameterisation. To overcome challenges stemming from the lack of the convexity of the objective function and the discontinuity arising from the entropy regulariser, we leverage the performance difference lemma and the duality relationship between the gradient and mirror descent flows. Our analysis provides a theoretical foundation for developing various discrete policy gradient algorithms.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"56 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Irreducible Components of Sets of Points in the Plane that Satisfy Distance Conditions","authors":"Niels Lubbes, Mehdi Makhul, Josef Schicho, Audie Warren","doi":"10.1007/s10208-025-09725-7","DOIUrl":"https://doi.org/10.1007/s10208-025-09725-7","url":null,"abstract":"<p>For a given graph whose edges are labeled with general real numbers, we consider the set of functions from the vertex set into the Euclidean plane such that the distance between the images of neighbouring vertices is equal to the corresponding edge label. This set of functions can be expressed as the zero set of quadratic polynomials and our main result characterizes the number of complex irreducible components of this zero set in terms of combinatorial properties of the graph. In case the complex components are three-dimensional, then the graph is minimally rigid and the component number is a well-known invariant from rigidity theory. If the components are four-dimensional, then they correspond to one-dimensional coupler curves of flexible planar mechanisms. As an application, we characterize the degree of irreducible components of such coupler curves combinatorially.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"27 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multisymplecticity in Finite Element Exterior Calculus","authors":"Ari Stern, Enrico Zampa","doi":"10.1007/s10208-025-09720-y","DOIUrl":"https://doi.org/10.1007/s10208-025-09720-y","url":null,"abstract":"<p>We consider the application of finite element exterior calculus (FEEC) methods to a class of canonical Hamiltonian PDE systems involving differential forms. Solutions to these systems satisfy a local <i>multisymplectic conservation law</i>, which generalizes the more familiar symplectic conservation law for Hamiltonian systems of ODEs, and which is connected with physically-important reciprocity phenomena, such as Lorentz reciprocity in electromagnetics. We characterize hybrid FEEC methods whose numerical traces satisfy a version of the multisymplectic conservation law, and we apply this characterization to several specific classes of FEEC methods, including conforming Arnold–Falk–Winther-type methods and various hybridizable discontinuous Galerkin (HDG) methods. Interestingly, the HDG-type and other nonconforming methods are shown, in general, to be multisymplectic in a stronger sense than the conforming FEEC methods. This substantially generalizes previous work of McLachlan and Stern [Found. Comput. Math., 20 (2020), pp. 35–69] on the more restricted class of canonical Hamiltonian PDEs in the de Donder–Weyl “grad-div” form.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"1 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interlacing Polynomial Method for Matrix Approximation via Generalized Column and Row Selection","authors":"Jian-Feng Cai, Zhiqiang Xu, Zili Xu","doi":"10.1007/s10208-025-09719-5","DOIUrl":"https://doi.org/10.1007/s10208-025-09719-5","url":null,"abstract":"<p>This paper delves into the spectral norm aspect of the Generalized Column and Row Subset Selection (GCRSS) problem. Given a target matrix <span><script type=\"math/tex\">\\textbf{A}\\in \\mathbb{R}^{n\\times d}</script></span>, the objective of GCRSS is to select a column submatrix <span><script type=\"math/tex\">\\textbf{B}_{:,S}\\in \\mathbb{R}^{n\\times k}</script></span> from the source matrix <span><script type=\"math/tex\">\\textbf{B}\\in \\mathbb{R}^{n\\times d_B}</script></span> and a row submatrix <span><script type=\"math/tex\">\\textbf{C}_{R,:}\\in \\mathbb{R}^{r\\times d}</script></span> from the source matrix <span><script type=\"math/tex\">\\textbf{C}\\in \\mathbb{R}^{n_C\\times d}</script></span>, such that the residual matrix <span><script type=\"math/tex\">(\\textbf{I}_n-\\textbf{B}_{:,S}\\textbf{B}_{:,S}^{\\dagger})\\textbf{A}(\\textbf{I}_d-\\textbf{C}_{R,:}^{\\dagger}\\textbf{C}_{R,:})</script></span> has a small spectral norm. By employing the method of interlacing polynomials, we show that the smallest possible spectral norm of a residual matrix can be bounded by the largest root of a related expected characteristic polynomial. A deterministic polynomial time algorithm is provided for the spectral norm case of the GCRSS problem. We next apply our results to two specific GCRSS scenarios, one where <span><script type=\"math/tex\">r=0</script></span>, simplifying the problem to the Generalized Column Subset Selection (GCSS) problem, and the other where <span><script type=\"math/tex\">\\textbf{B}=\\textbf{C}=\\textbf{I}_d</script></span>, reducing the problem to the submatrix selection problem. In the GCSS scenario, we connect the expected characteristic polynomials to the convolution of multi-affine polynomials, leading to the derivation of the first provable reconstruction bound on the spectral norm of a residual matrix. In the submatrix selection scenario, we show that for any sufficiently small <span><script type=\"math/tex\">\\varepsilon >0</script></span> and any square matrix <span><script type=\"math/tex\">\\textbf{A}\\in \\mathbb{R}^{d\\times d}</script></span>, there exist two subsets <span><script type=\"math/tex\">S\\subset [d]</script></span> and <span><script type=\"math/tex\">R\\subset [d]</script></span> of sizes <span><script type=\"math/tex\">O(d\\cdot \\varepsilon^2)</script></span> such that <span><script type=\"math/tex\">\\Vert \\textbf{A}_{S,R}\\Vert _2\\le \\varepsilon \\cdot \\Vert \\textbf{A}\\Vert _2</script></span></p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"18 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gradient Flows and Riemannian Structure in the Gromov-Wasserstein Geometry","authors":"Zhengxin Zhang, Ziv Goldfeld, Kristjan Greenewald, Youssef Mroueh, Bharath K. Sriperumbudur","doi":"10.1007/s10208-025-09722-w","DOIUrl":"https://doi.org/10.1007/s10208-025-09722-w","url":null,"abstract":"<p>The Wasserstein space of probability measures is known for its intricate Riemannian structure, which underpins the Wasserstein geometry and enables gradient flow algorithms. However, the Wasserstein geometry may not be suitable for certain tasks or data modalities. Motivated by scenarios where the global structure of the data needs to be preserved, this work initiates the study of gradient flows and Riemannian structure in the Gromov-Wasserstein (GW) geometry, which is particularly suited for such purposes. We focus on the inner product GW (IGW) distance between distributions on <span><script type=\"math/tex\">\\mathbb{R}^d</script></span>, which preserves the angles within the data and serves as a convenient initial setting due to its analytic tractability. Given a functional <span><script type=\"math/tex\">\\textsf{F}:\\mathcal{P}_2(\\mathbb{R}^d)\\rightarrow \\mathbb{R}</script></span> to optimize and an initial distribution <span><script type=\"math/tex\">\\rho_0\\in \\mathcal{P}_2(\\mathbb{R}^d)</script></span>, we present an implicit IGW minimizing movement scheme that generates a sequence of distributions <span><script type=\"math/tex\">\\{\\rho_i\\}_{i=0}^n</script></span>, which are close in IGW and aligned in the 2-Wasserstein sense. Taking the time step to zero, we prove that the (piecewise constant interpolation of the) discrete solution converges to an IGW generalized minimizing movement (GMM) <span><script type=\"math/tex\">(\\rho_t)_t</script></span> that follows the continuity equation with a velocity field <span><script type=\"math/tex\">v_t\\in L^2(\\rho_t;\\mathbb{R}^d)</script></span>, specified by a global transformation of the Wasserstein gradient of <span><script type=\"math/tex\">\\textsf{F}</script></span> (viz., the gradient of its first variation). The transformation is given by a mobility operator that modifies the Wasserstein gradient to encode not only local information, but also global structure, as expected for the IGW gradient flow. Our gradient flow analysis leads us to identify the Riemannian structure that gives rise to the intrinsic IGW geometry, using which we establish a Benamou-Brenier-like formula for IGW. We conclude with a formal derivation, akin to the Otto calculus, of the IGW gradient as the inverse mobility acting on the Wasserstein gradient. Numerical experiments demonstrating the global nature of IGW interpolations are provided to complement the theory.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"13 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144924530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}