Pseudospectral Shattering, the Sign Function, and Diagonalization in Nearly Matrix Multiplication Time

IF 2.5, Zone 1 (Mathematics), JCR Q2 (Computer Science, Theory & Methods)

Foundations of Computational Mathematics. Pub Date: 2022-08-24. DOI: 10.1007/s10208-022-09577-5

Jess Banks, Jorge Garza-Vargas, Archit Kulkarni, Nikhil Srivastava



Abstract

We exhibit a randomized algorithm which, given a square matrix \(A\in \mathbb{C}^{n\times n}\) with \(\Vert A\Vert \le 1\) and \(\delta >0\), computes with high probability an invertible \(V\) and diagonal \(D\) such that \(\Vert A-VDV^{-1}\Vert \le \delta\) using \(O(T_\mathsf{MM}(n)\log^2(n/\delta))\) arithmetic operations, in finite arithmetic with \(O(\log^4(n/\delta)\log n)\) bits of precision. The computed similarity \(V\) additionally satisfies \(\Vert V\Vert \Vert V^{-1}\Vert \le O(n^{2.5}/\delta)\). Here \(T_\mathsf{MM}(n)\) is the number of arithmetic operations required to multiply two \(n\times n\) complex matrices numerically stably, known to satisfy \(T_\mathsf{MM}(n)=O(n^{\omega +\eta})\) for every \(\eta >0\), where \(\omega\) is the exponent of matrix multiplication (Demmel et al. in Numer Math 108(1):59–91, 2007). The algorithm is a variant of the spectral bisection algorithm in numerical linear algebra (Beavers Jr. and Denman in Numer Math 21(1-2):143–169, 1974) with a crucial Gaussian perturbation preprocessing step. Our result significantly improves the previously best-known provable running times of \(O(n^{10}/\delta^2)\) arithmetic operations for diagonalization of general matrices (Armentano et al. in J Eur Math Soc 20(6):1375–1437, 2018) and (with regard to the dependence on \(n\)) \(O(n^3)\) arithmetic operations for Hermitian matrices (Dekker and Traub in Linear Algebra Appl 4:137–154, 1971). It is the first algorithm to achieve nearly matrix multiplication time for diagonalization in any model of computation (real arithmetic, rational arithmetic, or finite arithmetic), thereby matching the complexity of other dense linear algebra operations such as inversion and QR factorization up to polylogarithmic factors. The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into \(n\) small well-separated components. In particular, this implies that the eigenvalues of the perturbed matrix have a large minimum gap, a property of independent interest in random matrix theory. (2) We give a rigorous analysis of Roberts’ Newton iteration method (Roberts in Int J Control 32(4):677–687, 1980) for computing the sign function of a matrix in finite arithmetic, itself an open problem in numerical analysis since at least 1986.
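The two ingredients named in the abstract can be illustrated with a small numerical sketch. It is not the authors' algorithm. It runs in ordinary double precision with NumPy and does not reproduce the finite-precision analysis, the recursion, or the running-time guarantees. The function names (`gaussian_perturb`, `min_eigenvalue_gap`, `matrix_sign_newton`, `right_half_plane_projector`), the perturbation size `gamma=0.1`, and the stopping tolerance are all assumptions chosen for illustration. The sketch shows (a) a complex Gaussian perturbation applied to a Jordan block, whose eigenvalues all coincide before perturbation, and (b) the Newton iteration \(X_{k+1} = (X_k + X_k^{-1})/2\) for the matrix sign function, used to form a spectral projector that performs one bisection of the spectrum along the imaginary axis.

```python
import numpy as np


def gaussian_perturb(A, gamma, rng):
    """Return A + gamma * G, where G has i.i.d. complex Gaussian entries with E|G_ij|^2 = 1/n."""
    n = A.shape[0]
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    return A + gamma * G


def min_eigenvalue_gap(A):
    """Smallest pairwise distance between the computed eigenvalues of A."""
    ev = np.linalg.eigvals(A)
    dist = np.abs(ev[:, None] - ev[None, :])
    np.fill_diagonal(dist, np.inf)  # ignore the zero distance of each eigenvalue to itself
    return dist.min()


def matrix_sign_newton(A, max_iter=60, tol=1e-12):
    """Newton iteration X <- (X + X^{-1}) / 2 started at X = A.

    Assumes A has no eigenvalues on (or extremely close to) the imaginary axis.
    """
    X = np.array(A, dtype=complex)
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(X_next - X, 2) <= tol * np.linalg.norm(X_next, 2):
            return X_next
        X = X_next
    raise RuntimeError("Newton iteration did not converge")


def right_half_plane_projector(A):
    """Spectral projector (I + sign(A)) / 2 onto the eigenvalues of A with positive real part."""
    S = matrix_sign_newton(A)
    return 0.5 * (np.eye(A.shape[0]) + S)


# (a) Perturbing a Jordan block: all eigenvalues are 0 before the perturbation.
rng = np.random.default_rng(0)
J = np.diag(np.ones(7), k=1)  # 8 x 8 nilpotent Jordan block, not diagonalizable
J_pert = gaussian_perturb(J, gamma=0.1, rng=rng)
print("minimum eigenvalue gap before perturbation:", min_eigenvalue_gap(J))
print("minimum eigenvalue gap after perturbation: ", min_eigenvalue_gap(J_pert))

# (b) One bisection step on a matrix with two eigenvalues on each side of the imaginary axis.
V = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4) / 16
A = V @ np.diag([0.5, 0.3, -0.4, -0.7]) @ np.linalg.inv(V)
P = right_half_plane_projector(A)
print("eigenvalues with positive real part:", int(round(np.trace(P).real)))
```

The trace of the projector counts the eigenvalues on one side of the dividing line. A full bisection scheme would additionally need a basis for the range of the projector and a recursion on the two compressed matrices, both of which are omitted here.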




Source journal

Foundations of Computational Mathematics (Mathematics; Computer Science, Theory & Methods)

CiteScore: 6.90

Self-citation rate: 3.30%

Articles published:

Review time: >12 weeks

Journal description: Foundations of Computational Mathematics (FoCM) will publish research and survey papers of the highest quality which further the understanding of the connections between mathematics and computation. The journal aims to promote the exploration of all fundamental issues underlying the creative tension among mathematics, computer science and application areas unencumbered by any external criteria such as the pressure for applications. The journal will thus serve an increasingly important and applicable area of mathematics. The journal hopes to further the understanding of the deep relationships between mathematical theory (analysis, topology, geometry and algebra) and the computational processes as they are evolving in tandem with the modern computer. With its distinguished editorial board selecting papers of the highest quality and interest from the international community, FoCM hopes to influence both mathematics and computation. Relevance to applications will not constitute a requirement for the publication of articles. The journal does not accept code for review; however, authors who have code or data related to the submission should include a weblink to the repository where the data or code is stored.