{"title":"A Theory of the NEPv Approach for Optimization on the Stiefel Manifold","authors":"Ren-Cang Li","doi":"10.1007/s10208-024-09687-2","DOIUrl":"https://doi.org/10.1007/s10208-024-09687-2","url":null,"abstract":"<p>The NEPv approach has been increasingly used lately for optimization on the Stiefel manifold arising from machine learning. Generally speaking, the approach first turns the first-order optimality condition into a nonlinear eigenvalue problem with eigenvector dependency (NEPv) and then solves the nonlinear problem via some variation of the self-consistent-field (SCF) iteration. The difficulty, however, lies in designing a proper SCF iteration so that a maximizer is found at the end. Currently, each use of the approach is very much individualized, especially in its convergence analysis phase to show that the approach does work or otherwise. Relatedly, the NPDo approach was recently proposed for the sum of coupled traces; it seeks to turn the first-order optimality condition into a nonlinear polar decomposition with orthogonal factor dependency (NPDo). In this paper, two unifying frameworks are established, one for each approach. Each framework is built upon a basic assumption, under which global convergence to a stationary point is guaranteed and, during the SCF iterative process that leads to the stationary point, the objective function increases monotonically. Also, the notion of atomic function for each approach is proposed, and the atomic functions include commonly used matrix traces of linear and quadratic forms as special cases. It is shown that the basic assumptions of the approaches are satisfied by their respective atomic functions and, more importantly, by convex compositions of their respective atomic functions. 
Together they provide a large collection of objectives for which either one of the approaches or both are guaranteed to work, respectively.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"67 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142562122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explicit A Posteriori Error Representation for Variational Problems and Application to TV-Minimization","authors":"Sören Bartels, Alex Kaltenbach","doi":"10.1007/s10208-024-09676-5","DOIUrl":"https://doi.org/10.1007/s10208-024-09676-5","url":null,"abstract":"<p>In this paper, we propose a general approach for explicit <i>a posteriori</i> error representation for convex minimization problems using basic convex duality relations. Exploiting discrete orthogonality relations in the space of element-wise constant vector fields as well as a discrete integration-by-parts formula between the Crouzeix–Raviart and the Raviart–Thomas element, all convex duality relations are transferred to a discrete level, making the explicit <i>a posteriori</i> error representation, initially based on continuous arguments only, practicable from a numerical point of view. In addition, we provide a generalized Marini formula that determines a discrete primal solution in terms of a given discrete dual solution. We benchmark all these concepts via the Rudin–Osher–Fatemi model. This leads to an adaptive algorithm that yields a (quasi-optimal) linear convergence rate.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"20 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142449554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Gromov–Wasserstein Distance Between Spheres","authors":"Shreya Arya, Arnab Auddy, Ranthony A. Clark, Sunhyuk Lim, Facundo Mémoli, Daniel Packer","doi":"10.1007/s10208-024-09678-3","DOIUrl":"https://doi.org/10.1007/s10208-024-09678-3","url":null,"abstract":"<p>The Gromov–Wasserstein distance—a generalization of the usual Wasserstein distance—permits comparing probability measures defined on possibly different metric spaces. Recently, this notion of distance has found several applications in Data Science and in Machine Learning. With the goal of aiding both the interpretability of dissimilarity measures computed through the Gromov–Wasserstein distance and the assessment of the approximation quality of computational techniques designed to estimate the Gromov–Wasserstein distance, we determine the precise value of a certain variant of the Gromov–Wasserstein distance between unit spheres of different dimensions. Indeed, we consider a two-parameter family <span>\\(\\{d_{{\\text{GW}}p,q}\\}_{p,q=1}^{\\infty}\\)</span> of Gromov–Wasserstein distances between metric measure spaces. 
By exploiting a suitable interaction between specific values of the parameters <i>p</i> and <i>q</i> and the metric of the underlying spaces, we are able to determine the exact value of the distance <span>\\(d_{{\\text{GW}}4,2}\\)</span> between all pairs of unit spheres of different dimensions endowed with their Euclidean distance and their uniform measure.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"13 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142235260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unbiasing Hamiltonian Monte Carlo Algorithms for a General Hamiltonian Function","authors":"T. Lelièvre, R. Santet, G. Stoltz","doi":"10.1007/s10208-024-09677-4","DOIUrl":"https://doi.org/10.1007/s10208-024-09677-4","url":null,"abstract":"<p>Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo method for sampling high-dimensional probability measures. It relies on the integration of the Hamiltonian dynamics to propose a move, which is then accepted or rejected by a Metropolis procedure. Unbiased sampling is guaranteed when the numerical integrator preserves two key properties of the Hamiltonian dynamics: volume-preservation and reversibility up to momentum reversal. For separable Hamiltonian functions, some standard explicit numerical schemes, such as the Störmer–Verlet integrator, satisfy these properties. However, for numerical or physical reasons, one may consider a Hamiltonian function which is nonseparable, in which case the standard numerical schemes which preserve the volume and satisfy reversibility up to momentum reversal are implicit. When implemented in practice, such implicit schemes may admit many solutions or none, especially when the timestep is too large. We show here how to enforce the numerical reversibility, and thus unbiasedness, of HMC schemes in this context by introducing a reversibility check. In addition, for some specific forms of the Hamiltonian function, we discuss the consistency of these HMC schemes with some Langevin dynamics, and show in particular that our algorithm yields an efficient discretization of the metropolized overdamped Langevin dynamics with position-dependent diffusion coefficients. 
Numerical results illustrate the relevance of the reversibility check on simple problems.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"185 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142235261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Signed Barcodes for Multi-parameter Persistence via Rank Decompositions and Rank-Exact Resolutions","authors":"Magnus Bakke Botnan, Steffen Oppermann, Steve Oudot","doi":"10.1007/s10208-024-09672-9","DOIUrl":"https://doi.org/10.1007/s10208-024-09672-9","url":null,"abstract":"<p>In this paper, we introduce the signed barcode, a new visual representation of the global structure of the rank invariant of a multi-parameter persistence module or, more generally, of a poset representation. Like its unsigned counterpart in one-parameter persistence, the signed barcode decomposes the rank invariant as a <span>\\(\\mathbb{Z}\\)</span>-linear combination of rank invariants of indicator modules supported on segments in the poset. We develop the theory behind these decompositions, both for the usual rank invariant and for its generalizations, showing under what conditions they exist and are unique. We also show that, like its unsigned counterpart, the signed barcode reflects in part the algebraic structure of the module: specifically, it derives from the terms in the minimal rank-exact resolution of the module, i.e., its minimal projective resolution relative to the class of short exact sequences on which the rank invariant is additive. 
To complete the picture, we show some experimental results that illustrate the contribution of the signed barcode in the exploration of multi-parameter persistence modules.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"6 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142138385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Ramsey Multiplicity Bounds and Search Heuristics","authors":"Olaf Parczyk, Sebastian Pokutta, Christoph Spiegel, Tibor Szabó","doi":"10.1007/s10208-024-09675-6","DOIUrl":"https://doi.org/10.1007/s10208-024-09675-6","url":null,"abstract":"<p>We study two related problems concerning the number of homogeneous subsets of given size in graphs that go back to questions of Erdős. Most notably, we improve the upper bounds on the Ramsey multiplicity of <span>\\(K_4\\)</span> and <span>\\(K_5\\)</span> and settle the minimum number of independent sets of size 4 in graphs with clique number at most 4. Motivated by the elusiveness of the symmetric Ramsey multiplicity problem, we also introduce an off-diagonal variant and obtain tight results when counting monochromatic <span>\\(K_4\\)</span> or <span>\\(K_5\\)</span> in only one of the colors and triangles in the other. The extremal constructions for each problem turn out to be blow-ups of a graph of constant size and were found through search heuristics. They are complemented by lower bounds established using flag algebras, resulting in a fully computer-assisted approach. For some of our theorems we can also derive that the extremal construction is stable in a very strong sense. 
More broadly, these problems lead us to the study of the region of possible pairs of clique and independent set densities that can be realized as the limit of some sequence of graphs.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"2 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142084945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grounded Persistent Path Homology: A Stable, Topological Descriptor for Weighted Digraphs","authors":"Thomas Chaplin, Heather A. Harrington, Ulrike Tillmann","doi":"10.1007/s10208-024-09679-2","DOIUrl":"https://doi.org/10.1007/s10208-024-09679-2","url":null,"abstract":"<p>Weighted digraphs are used to model a variety of natural systems and can exhibit interesting structure across a range of scales. In order to understand and compare these systems, we require stable, interpretable, multiscale descriptors. To this end, we propose grounded persistent path homology (<span>GrPPH</span>)—a new, functorial, topological descriptor that describes the structure of an edge-weighted digraph via a persistence barcode. We show there is a choice of circuit basis for the graph which yields geometrically interpretable representatives for the features in the barcode. Moreover, we show the barcode is stable, in bottleneck distance, to both numerical and structural perturbations.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"88 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142045624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Time-Scales in Two-Layers Neural Networks","authors":"Raphaël Berthier, Andrea Montanari, Kangjie Zhou","doi":"10.1007/s10208-024-09664-9","DOIUrl":"https://doi.org/10.1007/s10208-024-09664-9","url":null,"abstract":"<p>Gradient-based learning in multi-layer neural networks displays a number of striking features. In particular, the decrease rate of empirical risk is non-monotone even after averaging over large batches. Long plateaus in which one observes barely any progress alternate with intervals of rapid decrease. These successive phases of learning often take place on very different time scales. Finally, models learnt in an early phase are typically ‘simpler’ or ‘easier to learn’ although in a way that is difficult to formalize. Although theoretical explanations of these phenomena have been put forward, each of them captures at best certain specific regimes. In this paper, we study the gradient flow dynamics of a wide two-layer neural network in high-dimension, when data are distributed according to a single-index model (i.e., the target function depends on a one-dimensional projection of the covariates). Based on a mixture of new rigorous results, non-rigorous mathematical derivations, and numerical simulations, we propose a scenario for the learning dynamics in this setting. In particular, the proposed evolution exhibits separation of timescales and intermittency. 
These behaviors arise naturally because the population gradient flow can be recast as a singularly perturbed dynamical system.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"17 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142042699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Universal Equivariance Properties of Exotic Aromatic B-Series","authors":"Adrien Laurent, Hans Munthe-Kaas","doi":"10.1007/s10208-024-09668-5","DOIUrl":"https://doi.org/10.1007/s10208-024-09668-5","url":null,"abstract":"<p>The exotic aromatic Butcher series were originally introduced for the calculation of order conditions for the high order numerical integration of ergodic stochastic differential equations in <span>\\(\\mathbb{R}^d\\)</span> and on manifolds. We prove in this paper that exotic aromatic B-series satisfy a universal geometric property, namely that they are characterised by locality and equivariance with respect to orthogonal changes of coordinates. This characterisation confirms that exotic aromatic B-series are a fundamental geometric object that naturally generalises aromatic B-series and B-series, as they share similar equivariance properties. In addition, we provide a classification of the main subsets of the exotic aromatic B-series, in particular the exotic B-series, using different equivariance properties. Along the analysis, we present a generalised definition of exotic aromatic trees, dual vector fields, and we explore the impact of degeneracies on the classification.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"31 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141994403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximations of Dispersive PDEs in the Presence of Low-Regularity Randomness","authors":"Yvonne Alama Bronsard, Yvain Bruned, Katharina Schratz","doi":"10.1007/s10208-023-09625-8","DOIUrl":"https://doi.org/10.1007/s10208-023-09625-8","url":null,"abstract":"<p>We introduce a new class of numerical schemes which allow for low-regularity approximations to the expectation <span>\\(\\mathbb{E}(|u_{k}(t, v^{\\eta})|^2)\\)</span>, where <span>\\(u_k\\)</span> denotes the <i>k</i>-th Fourier coefficient of the solution <i>u</i> of the dispersive equation and <span>\\(v^{\\eta}(x)\\)</span> the associated random initial data. This quantity plays an important role in physics, in particular in the study of wave turbulence where one needs to adopt a statistical approach in order to obtain deep insight into the <i>generic</i> long-time behaviour of solutions to dispersive equations. Our new class of schemes is based on Wick’s theorem and Feynman diagrams together with a resonance-based discretisation (Bruned and Schratz in Forum Math Pi 10:E2, 2022) set in a more general context: we introduce a novel combinatorial structure called paired decorated forests, which are two decorated trees whose decorations on the leaves come in pairs. The character of the scheme draws its inspiration from the treatment of singular stochastic partial differential equations via regularity structures. In contrast to classical approaches, we do not discretise the PDE itself, but rather its expectation. 
This allows us to heavily exploit the optimal resonance structure and underlying gain in regularity on the finite dimensional (discrete) level.</p>","PeriodicalId":55151,"journal":{"name":"Foundations of Computational Mathematics","volume":"16 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}