Data-driven linear complexity low-rank approximation of general kernel matrices: A geometric approach

IF 1.8 · CAS Tier 3 (Mathematics) · JCR Q1 MATHEMATICS
Difeng Cai, Edmond Chow, Yuanzhe Xi
DOI: 10.1002/nla.2519
Journal: Numerical Linear Algebra with Applications
Published: 2023-07-04 (Journal Article)
Citations: 2

Abstract

Data-driven linear complexity low-rank approximation of general kernel matrices: A geometric approach
A general, rectangular kernel matrix may be defined as $K_{ij} = \kappa(x_i, y_j)$, where $\kappa(x, y)$ is a kernel function and where $X = \{x_i\}_{i=1}^{m}$ and $Y = \{y_i\}_{i=1}^{n}$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the point sets $X$ and $Y$ are large and arbitrarily distributed: far away from each other, "intermingled", identical, and so forth. Such rectangular kernel matrices may arise, for example, in Gaussian process regression, where $X$ corresponds to the training data and $Y$ corresponds to the test data. In this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function and avoid forming the matrix explicitly, which rules out most algebraic techniques. In particular, we seek methods that scale linearly, or nearly linearly, with the size of the data for a fixed approximation rank. The main idea in this paper is to geometrically select appropriate subsets of points to construct a low-rank approximation. An analysis in this paper guides how this selection should be performed.
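The abstract's idea of geometrically selecting point subsets to build a low-rank approximation without ever forming the full matrix can be illustrated with a generic cross- (skeleton-) type approximation $K \approx C\,U\,R$, where $C$ and $R$ are columns and rows of $K$ indexed by selected landmark points. The sketch below does not reproduce the paper's selection method; it uses farthest-point sampling as a simple stand-in geometric selector, and a Gaussian kernel as an example. All function names here are illustrative, not from the paper.

```python
import numpy as np

def gaussian_kernel(X, Y, ell=1.0):
    # Example kernel: K_ij = exp(-||x_i - y_j||^2 / (2 ell^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ell**2))

def farthest_point_sample(P, k, seed=0):
    # Greedy farthest-point sampling: a simple geometric way to pick
    # k well-spread landmark points from the set P.
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(P)))]
    d = np.linalg.norm(P - P[idx[0]], axis=1)
    for _ in range(k - 1):
        idx.append(int(d.argmax()))
        d = np.minimum(d, np.linalg.norm(P - P[idx[-1]], axis=1))
    return np.array(idx)

def skeleton_lowrank(X, Y, k, kernel=gaussian_kernel):
    # Cross/skeleton approximation K ~= C @ U @ R built from k landmark
    # rows and columns; only O((m + n) k) kernel evaluations are needed,
    # so the full m-by-n matrix is never formed.
    I = farthest_point_sample(X, k)
    J = farthest_point_sample(Y, k)
    C = kernel(X, Y[J])                      # m x k: selected columns
    U = np.linalg.pinv(kernel(X[I], Y[J]))   # k x k: pseudoinverse of core block
    R = kernel(X[I], Y)                      # k x n: selected rows
    return C, U, R
```

For smooth kernels and well-spread landmarks, the product `C @ U @ R` is typically an accurate rank-$k$ approximation; the quality of the geometric selection is exactly what the paper's analysis addresses.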
Source journal
CiteScore: 3.40
Self-citation rate: 2.30%
Annual articles: 50
Review time: 12 months
期刊介绍: Manuscripts submitted to Numerical Linear Algebra with Applications should include large-scale broad-interest applications in which challenging computational results are integral to the approach investigated and analysed. Manuscripts that, in the Editor’s view, do not satisfy these conditions will not be accepted for review. Numerical Linear Algebra with Applications receives submissions in areas that address developing, analysing and applying linear algebra algorithms for solving problems arising in multilinear (tensor) algebra, in statistics, such as Markov Chains, as well as in deterministic and stochastic modelling of large-scale networks, algorithm development, performance analysis or related computational aspects. Topics covered include: Standard and Generalized Conjugate Gradients, Multigrid and Other Iterative Methods; Preconditioning Methods; Direct Solution Methods; Numerical Methods for Eigenproblems; Newton-like Methods for Nonlinear Equations; Parallel and Vectorizable Algorithms in Numerical Linear Algebra; Application of Methods of Numerical Linear Algebra in Science, Engineering and Economics.