SIAM journal on mathematics of data science最新文献_第5页

Efficient Identification of Butterfly Sparse Matrix Factorizations 蝴蝶稀疏矩阵分解的高效识别

SIAM journal on mathematics of data science Pub Date : 2021-10-04 DOI: 10.1137/22m1488727

Léon Zheng, E. Riccietti, R. Gribonval

{"title":"Efficient Identification of Butterfly Sparse Matrix Factorizations","authors":"Léon Zheng, E. Riccietti, R. Gribonval","doi":"10.1137/22m1488727","DOIUrl":"https://doi.org/10.1137/22m1488727","url":null,"abstract":"Fast transforms correspond to factorizations of the form $mathbf{Z} = mathbf{X}^{(1)} ldots mathbf{X}^{(J)}$, where each factor $ mathbf{X}^{(ell)}$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations, i.e., uniqueness up to unavoidable scaling ambiguities. Our main contribution is to prove that any $N times N$ matrix having the so-called butterfly structure admits an essentially unique factorization into $J$ butterfly factors (where $N = 2^{J}$), and that the factors can be recovered by a hierarchical factorization method, which consists in recursively factorizing the considered matrix into two factors. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting. This approach contrasts with existing ones that fit the product of butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorization of the Hadamard or the discrete Fourier transform matrices of size $N=2^J$. Computing such factorizations costs $mathcal{O}(N^{2})$, which is of the order of dense matrix-vector multiplication, while the obtained factorizations enable fast $mathcal{O}(N log N)$ matrix-vector multiplications and have the potential to be applied to compress deep neural networks.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"50 1","pages":"22-49"},"PeriodicalIF":0.0,"publicationDate":"2021-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75817474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization 重点:计算最优和通信高效的分散非凸有限和优化

SIAM journal on mathematics of data science Pub Date : 2021-10-04 DOI: 10.1137/21m1450677

Boyue Li, Zhize Li, Yuejie Chi

{"title":"DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization","authors":"Boyue Li, Zhize Li, Yuejie Chi","doi":"10.1137/21m1450677","DOIUrl":"https://doi.org/10.1137/21m1450677","url":null,"abstract":"Emerging applications in multi-agent environments such as internet-of-things, networked sensing, autonomous systems and federated learning, call for decentralized algorithms for finite-sum optimizations that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS) for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiencies of DESTRESS improve upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas including randomly activated stochastic recursive gradient updates with mini-batches for local computation, gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyper-parameters and new analysis frameworks to provably achieve a desirable computation-communication trade-off.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"6 1","pages":"1031-1051"},"PeriodicalIF":0.0,"publicationDate":"2021-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84079619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 13

Local versions of sum-of-norms clustering 规范和聚类的局部版本

SIAM journal on mathematics of data science Pub Date : 2021-09-20 DOI: 10.1137/21m1448732

Alexander Dunlap, J. Mourrat

引用次数: 3

Moving Up the Cluster Tree with the Gradient Flow 用梯度流向上移动集群树

SIAM journal on mathematics of data science Pub Date : 2021-09-17 DOI: 10.1137/22m1469869

E. Arias-Castro, Wanli Qiao

引用次数: 1

Analysis of Spatial and Spatiotemporal Anomalies Using Persistent Homology: Case Studies with COVID-19 Data 基于持续同源性的时空异常分析:以COVID-19数据为例

SIAM journal on mathematics of data science Pub Date : 2021-07-19 DOI: 10.1137/21m1435033

Abigail Hickok, D. Needell, M. A. Porter

引用次数: 6

Intrinsic Dimension Adaptive Partitioning for Kernel Methods 核方法的内维数自适应划分

SIAM journal on mathematics of data science Pub Date : 2021-07-16 DOI: 10.1137/21m1435690

Thomas Hamm, Ingo Steinwart

引用次数: 3

Block Alternating Bregman Majorization Minimization with Extrapolation 块交替布雷格曼最大化最小化与外推

SIAM journal on mathematics of data science Pub Date : 2021-07-09 DOI: 10.1137/21M1432661

L. Hien, D. Phan, Nicolas Gillis, Masoud Ahookhosh, Panagiotis Patrinos

引用次数: 5

A Generalized CUR decomposition for matrix pairs 矩阵对的广义CUR分解

SIAM journal on mathematics of data science Pub Date : 2021-07-07 DOI: 10.1137/21m1432119

Perfect Y. Gidisu, M. Hochstenbach

{"title":"A Generalized CUR decomposition for matrix pairs","authors":"Perfect Y. Gidisu, M. Hochstenbach","doi":"10.1137/21m1432119","DOIUrl":"https://doi.org/10.1137/21m1432119","url":null,"abstract":"We propose a generalized CUR (GCUR) decomposition for matrix pairs (A,B). Given matrices A and B with the same number of columns, such a decomposition provides low-rank approximations of both matrices simultaneously, in terms of some of their rows and columns. We obtain the indices for selecting the subset of rows and columns of the original matrices using the discrete empirical interpolation method (DEIM) on the generalized singular vectors. When B is square and nonsingular, there are close connections between the GCUR of (A,B) and the DEIM-induced CUR of AB−1. When B is the identity, the GCUR decomposition of A coincides with the DEIM-induced CUR decomposition of A. We also show similar connection between the GCUR of (A,B) and the CUR of AB for a nonsquare but full-rank matrix B, where B denotes the Moore–Penrose pseudoinverse of B. While a CUR decomposition acts on one data set, a GCUR factorization jointly decomposes two data sets. The algorithm may be suitable for applications where one is interested in extracting the most discriminative features from one data set relative to another data set. In numerical experiments, we demonstrate the advantages of the new method over the standard CUR approximation; for recovering data perturbed with colored noise and subgroup discovery.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"40 1","pages":"386-409"},"PeriodicalIF":0.0,"publicationDate":"2021-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81598226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7

A Generative Variational Model for Inverse Problems in Imaging 成像反问题的生成变分模型

SIAM journal on mathematics of data science Pub Date : 2021-04-26 DOI: 10.1137/21m1414978

Andreas Habring, M. Holler

引用次数: 11

Operator Shifting for General Noisy Matrix Systems 一般有噪声矩阵系统的算子移位

SIAM journal on mathematics of data science Pub Date : 2021-04-22 DOI: 10.1137/21m1416849

Philip A. Etter, Lexing Ying

{"title":"Operator Shifting for General Noisy Matrix Systems","authors":"Philip A. Etter, Lexing Ying","doi":"10.1137/21m1416849","DOIUrl":"https://doi.org/10.1137/21m1416849","url":null,"abstract":". In the computational sciences, one must often estimate model parameters from data subject to noise and uncertainty, leading to inaccurate results. In order to improve the accuracy of models with noisy parameters, we consider the problem of reducing error in a linear system with the operator corrupted by noise. Our contribution in this paper is to extend the elliptic operator shifting framework from Etter, Ying ’20 to the general nonsymmetric matrix case. Roughly, the operator shifting technique is a matrix analogue of the James-Stein estimator. The key insight is that a shift of the matrix inverse estimate in an appropriately chosen direction will reduce average error. In our extension, we interrogate a number of questions — namely, whether or not shifting towards the origin for general matrix inverses always reduces error as it does in the elliptic case. We show that this is usually the case, but that there are three key features of the general nonsingular matrices that allow for adversarial examples not possible in the symmetric case. We prove that when these adversarial possibilities are eliminated by the assumption of noise symmetry and the use of the residual norm as the error metric, the optimal shift is always towards the origin, mirroring results from Etter, Ying ’20. We also investigate behavior in the small noise regime and other scenarios. We conclude by presenting numerical experiments (with accompanying source code) inspired by Reinforcement Learning to demonstrate that operator shifting can yield substantial reductions in error.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42222215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1