{"title":"Gradient descent for deep matrix factorization: Dynamics and implicit bias towards low rank","authors":"Hung-Hsu Chou , Carsten Gieshoff , Johannes Maly , Holger Rauhut","doi":"10.1016/j.acha.2023.101595","DOIUrl":"https://doi.org/10.1016/j.acha.2023.101595","url":null,"abstract":"<div><p>In deep learning<span>, it is common to use more network parameters than training points. In such a scenario of over-parameterization, there are usually multiple networks that achieve zero training error so that the training algorithm induces an implicit bias on the computed solution. In practice, (stochastic) gradient descent tends to prefer solutions which generalize well, which provides a possible explanation of the success of deep learning. In this paper we analyze the dynamics of gradient descent in the simplified setting of linear networks and of an estimation problem. Although we are not in an overparameterized scenario, our analysis nevertheless provides insights into the phenomenon of implicit bias. In fact, we derive a rigorous analysis of the dynamics of vanilla gradient descent, and characterize the dynamical convergence of the spectrum. We are able to accurately locate time intervals where the effective rank of the iterates is close to the effective rank of a low-rank projection of the ground-truth matrix. In practice, those intervals can be used as criteria for early stopping if a certain regularity is desired. We also provide empirical evidence for implicit bias in more general scenarios, such as matrix sensing and random initialization. 
This suggests that deep learning prefers trajectories whose complexity (measured in terms of effective rank) is monotonically increasing, which we believe is a fundamental concept for the theoretical understanding of deep learning.</span></p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"68 ","pages":"Article 101595"},"PeriodicalIF":2.5,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49778391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite alphabet phase retrieval","authors":"Tamir Bendory, Dan Edidin, Ivan Gonzalez","doi":"10.1016/j.acha.2023.04.005","DOIUrl":"https://doi.org/10.1016/j.acha.2023.04.005","url":null,"abstract":"<div><p>We consider the finite alphabet phase retrieval problem: recovering a signal whose entries lie in a small alphabet of possible values from its Fourier magnitudes. This problem arises in the celebrated technology of X-ray crystallography to determine the atomic structure of biological molecules. Our main result states that for generic values of the alphabet, two signals have the same Fourier magnitudes if and only if several partitions have the same difference sets. Thus, the finite alphabet phase retrieval problem reduces to the combinatorial problem of determining a signal from those difference sets. Notably, this result holds true when one of the letters of the alphabet is zero, namely, for sparse signals with finite alphabet, which is the situation in X-ray crystallography.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 151-160"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49726863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Near-optimal bounds for generalized orthogonal Procrustes problem via generalized power method","authors":"Shuyang Ling","doi":"10.1016/j.acha.2023.04.008","DOIUrl":"https://doi.org/10.1016/j.acha.2023.04.008","url":null,"abstract":"<div><p><span><span>Given multiple point clouds, how to find the rigid transform (rotation, reflection, and shifting) such that these point clouds are well aligned? This problem, known as the generalized orthogonal Procrustes problem (GOPP), has found numerous applications in statistics<span>, computer vision, and imaging science. While one commonly-used method is finding the least squares estimator, it is generally an NP-hard problem to obtain the least squares estimator exactly due to the notorious nonconvexity. In this work, we apply the semidefinite programming (SDP) relaxation and the generalized power method to solve this generalized orthogonal Procrustes problem. In particular, we assume the data are generated from a signal-plus-noise model: each observed point cloud is a noisy copy of the same unknown point cloud transformed by an unknown </span></span>orthogonal matrix<span> and also corrupted by additive Gaussian noise. We show that the generalized power method (equivalently alternating minimization algorithm) with spectral initialization converges to the unique global optimum to the SDP relaxation, provided that the signal-to-noise ratio is high. Moreover, this limiting point is exactly the least squares estimator and also the maximum likelihood estimator. Our theoretical bound is near-optimal in terms of the information-theoretic limit (only loose by a factor of the dimension and a log factor). Our results significantly improve the state-of-the-art results on the tightness of the SDP relaxation for the generalized orthogonal Procrustes problem, an open problem posed by Bandeira et al. 
(2014) </span></span><span>[8]</span>.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 62-100"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49726663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decentralized learning over a network with Nyström approximation using SGD","authors":"Heng Lian , Jiamin Liu","doi":"10.1016/j.acha.2023.06.005","DOIUrl":"10.1016/j.acha.2023.06.005","url":null,"abstract":"<div><p>Nowadays we often meet with a learning problem when data are distributed on different machines connected via a network, instead of stored centrally. Here we consider decentralized supervised learning in a reproducing kernel Hilbert space<span>. We note that standard gradient descent in a reproducing kernel Hilbert space is difficult to implement with multiple communications between worker machines. On the other hand, the Nyström approximation using gradient descent is more suited for the decentralized setting since only a small number of data points need to be shared at the beginning of the algorithm. In the setting of decentralized distributed learning in a reproducing kernel Hilbert space, we establish the optimal learning rate of stochastic gradient descent based on mini-batches, allowing multiple passes over the data set. The proposal provides a scalable approach to nonparametric estimation combining gradient method, distributed estimation, and random projection.</span></p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 373-387"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44914607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PiPs: A kernel-based optimization scheme for analyzing non-stationary 1D signals","authors":"Jieren Xu , Yitong Li , Haizhao Yang , David Dunson , Ingrid Daubechies","doi":"10.1016/j.acha.2023.04.002","DOIUrl":"10.1016/j.acha.2023.04.002","url":null,"abstract":"<div><p>This paper proposes a novel kernel-based optimization scheme to handle tasks in the analysis, <em>e.g.</em>, signal spectral estimation and single-channel source separation of 1D non-stationary oscillatory data. The key insight of our optimization scheme for reconstructing the time-frequency information is that when a nonparametric regression is applied on some input values, the output regressed points would lie near the oscillatory pattern of the oscillatory 1D signal only if these input values are a good approximation of the ground-truth phase function. In this work, <em>Gaussian Process (GP)</em> is chosen to conduct this nonparametric regression: the oscillatory pattern is encoded as the <em>Pattern-inducing Points (PiPs)</em> which act as the training data points in the GP regression; while the targeted phase function is fed in to compute the correlation kernels, acting as the testing input. Better approximated phase function generates more precise kernels, thus resulting in smaller optimization loss error when comparing the kernel-based regression output with the original signals. To the best of our knowledge, this is the first algorithm that can satisfactorily handle fully non-stationary oscillatory data, close and crossover frequencies, and general oscillatory patterns. 
Even in the example of a signal produced by slow variation in the parameters of a trigonometric expansion, we show that PiPs admits competitive or better performance in terms of accuracy and robustness than existing state-of-the-art algorithms.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 1-17"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43227419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modewise operators, the tensor restricted isometry property, and low-rank tensor recovery","authors":"Cullen A. Haselby , Mark A. Iwen , Deanna Needell , Michael Perlmutter , Elizaveta Rebrova","doi":"10.1016/j.acha.2023.04.007","DOIUrl":"https://doi.org/10.1016/j.acha.2023.04.007","url":null,"abstract":"<div><p><span>Recovery of sparse vectors and low-rank matrices from a small number of linear measurements is well-known to be possible under various model assumptions on the measurements. The key requirement on the measurement matrices is typically the restricted isometry property, that is, approximate </span>orthonormality when acting on the subspace to be recovered. Among the most widely used random matrix measurement models are (a) independent subgaussian models and (b) randomized Fourier-based models, allowing for the efficient computation of the measurements.</p><p>For the now ubiquitous tensor data, direct application of the known recovery algorithms to the vectorized or matricized tensor is memory-heavy because of the huge measurement matrices to be constructed and stored. In this paper, we propose modewise measurement schemes based on subgaussian and randomized Fourier measurements. These modewise operators act on the pairs or other small subsets of the tensor modes separately. 
They require significantly less memory than the measurements working on the vectorized tensor, provably satisfy the tensor restricted isometry property and experimentally can recover the tensor data from fewer measurements and do not require impractical storage.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 161-192"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49726899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frames by orbits of two operators that commute","authors":"A. Aguilera , C. Cabrelli , D. Carbajal , V. Paternostro","doi":"10.1016/j.acha.2023.04.006","DOIUrl":"https://doi.org/10.1016/j.acha.2023.04.006","url":null,"abstract":"<div><p><span>Frames formed by orbits of vectors through the iteration of a bounded operator<span> have recently attracted considerable attention, in particular due to their applications to dynamical sampling. In this article, we consider two commuting bounded operators acting on some separable Hilbert space </span></span><span><math><mi>H</mi></math></span>. We completely characterize operators <em>T</em> and <em>L</em> with <span><math><mi>T</mi><mi>L</mi><mo>=</mo><mi>L</mi><mi>T</mi></math></span> and sets <span><math><mi>Φ</mi><mo>⊂</mo><mi>H</mi></math></span> such that the collection <span><math><mo>{</mo><msup><mrow><mi>T</mi></mrow><mrow><mi>k</mi></mrow></msup><msup><mrow><mi>L</mi></mrow><mrow><mi>j</mi></mrow></msup><mi>ϕ</mi><mo>:</mo><mi>k</mi><mo>∈</mo><mi>Z</mi><mo>,</mo><mi>j</mi><mo>∈</mo><mi>J</mi><mo>,</mo><mi>ϕ</mi><mo>∈</mo><mi>Φ</mi><mo>}</mo></math></span> forms a frame of <span><math><mi>H</mi></math></span><span><span>. This is done in terms of model subspaces of the space of square integrable functions defined on the torus and having values in some </span>Hardy space with multiplicity. The operators acting on these models are the bilateral shift and the compression of the unilateral shift (acting pointwise). 
This context includes the case when the Hilbert space </span><span><math><mi>H</mi></math></span> is a subspace of <span><math><msup><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msup><mo>(</mo><mi>R</mi><mo>)</mo></math></span>, invariant under translations along the integers, where the operator <em>T</em> is the translation by one and <em>L</em> is a shift-preserving operator.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 46-61"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49754458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast procedure for the construction of quadrature formulas for bandlimited functions","authors":"A. Gopal , V. Rokhlin","doi":"10.1016/j.acha.2023.05.001","DOIUrl":"https://doi.org/10.1016/j.acha.2023.05.001","url":null,"abstract":"<div><p><span>We introduce an efficient scheme for the construction of quadrature rules for bandlimited functions. While the scheme is predominantly based on well-known facts about prolate spheroidal wave functions of order zero, it has the asymptotic CPU time estimate </span><span><math><mi>O</mi><mo>(</mo><mi>n</mi><mi>log</mi><mo></mo><mi>n</mi><mo>)</mo></math></span> to construct an <em>n</em>-point quadrature rule. Moreover, the size of the “<span><math><mi>n</mi><mi>log</mi><mo></mo><mi>n</mi></math></span>” term in the CPU time estimate is small, so for all practical purposes the CPU time cost is proportional to <em>n</em>. The performance of the algorithm is illustrated by several numerical examples.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 193-210"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49738502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation under group actions: Recovering orbits from invariants","authors":"Afonso S. Bandeira , Ben Blum-Smith , Joe Kileel , Jonathan Niles-Weed , Amelia Perry , Alexander S. Wein","doi":"10.1016/j.acha.2023.06.001","DOIUrl":"https://doi.org/10.1016/j.acha.2023.06.001","url":null,"abstract":"<div><p>We study a class of <em>orbit recovery</em> problems in which we observe independent copies of an unknown element of <span><math><msup><mrow><mi>R</mi></mrow><mrow><mi>p</mi></mrow></msup></math></span>, each linearly acted upon by a random element of some group (such as <span><math><mi>Z</mi><mo>/</mo><mi>p</mi></math></span> or <span><math><mrow><mi>SO</mi></mrow><mo>(</mo><mn>3</mn><mo>)</mo></math></span><span>) and then corrupted by additive Gaussian noise. We prove matching upper and lower bounds on the number of samples required to approximately recover the group orbit of this unknown element with high probability. These bounds, based on quantitative techniques in invariant theory, give a precise correspondence between the statistical difficulty of the estimation problem and algebraic properties of the group. Furthermore, we give computer-assisted procedures to certify these properties that are computationally efficient in many cases of interest.</span></p><p>The model is motivated by geometric problems in signal processing, computer vision, and structural biology, and applies to the reconstruction problem in cryo-electron microscopy (cryo-EM), a problem of significant practical interest. Our results allow us to verify (for a given problem size) that if cryo-EM images are corrupted by noise with variance <span><math><msup><mrow><mi>σ</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span>, the number of images required to recover the molecule structure scales as <span><math><msup><mrow><mi>σ</mi></mrow><mrow><mn>6</mn></mrow></msup></math></span>. 
We match this bound with a novel (albeit computationally expensive) algorithm for <em>ab initio</em> reconstruction in cryo-EM, based on invariant features of degree at most 3. We further discuss how to recover multiple molecular structures from mixed (or heterogeneous) cryo-EM samples.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 236-319"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49738509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Riesz transform associated with the fractional Fourier transform and applications in image edge detection","authors":"Zunwei Fu , Loukas Grafakos , Yan Lin , Yue Wu , Shuhui Yang","doi":"10.1016/j.acha.2023.05.003","DOIUrl":"https://doi.org/10.1016/j.acha.2023.05.003","url":null,"abstract":"<div><p><span>The fractional Hilbert transform was introduced by Zayed </span><span>[30, Zayed, 1998]</span><span> and has been widely used in signal processing. In view of its connection with the fractional Fourier transform, Chen, the first, second and fourth authors of this paper in </span><span>[6, Chen et al., 2021]</span><span><span><span> studied the fractional Hilbert transform and other fractional multiplier operators on the real line. The present paper is concerned with a natural extension of the fractional Hilbert transform to higher dimensions: this extension is the fractional Riesz transform and is given by multiplication with a suitable chirp function on the fractional Fourier transform side. In addition to a thorough study of the fractional Riesz transform, in this work we also investigate the </span>boundedness<span> of singular integral operators<span> with chirp functions on rotation invariant spaces, chirp </span></span></span>Hardy spaces<span><span> and their relation to chirp BMO spaces, as well as applications of the theory of fractional multipliers in partial differential equations. Through numerical simulation, we provide physical and </span>geometric interpretations of high-dimensional fractional multipliers. Finally, we present an application of the fractional Riesz transforms in edge detection which verifies a hypothesis insinuated in </span></span><span>[26, Xu et al., 2016]</span>. 
In fact our numerical implementation confirms that amplitude, phase, and direction information can be simultaneously extracted by controlling the order of the fractional Riesz transform.</p></div>","PeriodicalId":55504,"journal":{"name":"Applied and Computational Harmonic Analysis","volume":"66 ","pages":"Pages 211-235"},"PeriodicalIF":2.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49738503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}