给出了QR分解、SVD和PCA在数据库连接上的旋转

Dan Olteanu, Nils Vortmeier, Ɖorđe Živanović
{"title":"给出了QR分解、SVD和PCA在数据库连接上的旋转","authors":"Dan Olteanu, Nils Vortmeier, Ɖorđe Živanović","doi":"10.1007/s00778-023-00818-9","DOIUrl":null,"url":null,"abstract":"<p>This article introduces <span>FiGaRo</span>, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. <span>FiGaRo</span> ’s main novelty is that it pushes the QR decomposition past the join. This leads to several desirable properties. For acyclic joins, it takes time linear in the database size and independent of the join size. Its execution is equivalent to the application of a sequence of Givens rotations proportional to the join size. Its number of rounding errors relative to the classical QR decomposition algorithms is on par with the database size relative to the join output size. The QR decomposition lies at the core of many linear algebra computations including the singular value decomposition (SVD) and the principal component analysis (PCA). We show how <span>FiGaRo</span> can be used to compute the orthogonal matrix in the QR decomposition, the SVD and the PCA of the join output without the need to materialize the join output. A suite of experiments validate that <span>FiGaRo</span> can outperform both in runtime performance and numerical accuracy the LAPACK library Intel MKL by a factor proportional to the gap between the sizes of the join output and input.\n</p>","PeriodicalId":501532,"journal":{"name":"The VLDB Journal","volume":"31 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Givens rotations for QR decomposition, SVD and PCA over database joins\",\"authors\":\"Dan Olteanu, Nils Vortmeier, Ɖorđe Živanović\",\"doi\":\"10.1007/s00778-023-00818-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>This article introduces <span>FiGaRo</span>, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. <span>FiGaRo</span> ’s main novelty is that it pushes the QR decomposition past the join. This leads to several desirable properties. For acyclic joins, it takes time linear in the database size and independent of the join size. Its execution is equivalent to the application of a sequence of Givens rotations proportional to the join size. Its number of rounding errors relative to the classical QR decomposition algorithms is on par with the database size relative to the join output size. The QR decomposition lies at the core of many linear algebra computations including the singular value decomposition (SVD) and the principal component analysis (PCA). We show how <span>FiGaRo</span> can be used to compute the orthogonal matrix in the QR decomposition, the SVD and the PCA of the join output without the need to materialize the join output. A suite of experiments validate that <span>FiGaRo</span> can outperform both in runtime performance and numerical accuracy the LAPACK library Intel MKL by a factor proportional to the gap between the sizes of the join output and input.\\n</p>\",\"PeriodicalId\":501532,\"journal\":{\"name\":\"The VLDB Journal\",\"volume\":\"31 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The VLDB Journal\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s00778-023-00818-9\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The VLDB Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00778-023-00818-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

本文介绍了一种在关系数据上由自然连接定义的矩阵QR分解中计算上三角矩阵的算法FiGaRo。FiGaRo的主要新颖之处在于它将QR分解推过了连接。这导致了几个理想的特性。对于非循环连接,它所花费的时间与数据库大小成线性关系,与连接大小无关。它的执行相当于应用一系列与连接大小成比例的Givens旋转。它相对于经典QR分解算法的舍入误差数量与数据库大小相对于连接输出大小的数量相当。QR分解是许多线性代数计算的核心,包括奇异值分解(SVD)和主成分分析(PCA)。我们展示了如何使用FiGaRo来计算QR分解中的正交矩阵、SVD和连接输出的PCA,而不需要具体化连接输出。一组实验验证了FiGaRo在运行时性能和数值精度上都优于LAPACK库Intel MKL,其系数与连接输出和输入大小之间的差距成正比。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Givens rotations for QR decomposition, SVD and PCA over database joins

Givens rotations for QR decomposition, SVD and PCA over database joins

This article introduces FiGaRo, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. FiGaRo ’s main novelty is that it pushes the QR decomposition past the join. This leads to several desirable properties. For acyclic joins, it takes time linear in the database size and independent of the join size. Its execution is equivalent to the application of a sequence of Givens rotations proportional to the join size. Its number of rounding errors relative to the classical QR decomposition algorithms is on par with the database size relative to the join output size. The QR decomposition lies at the core of many linear algebra computations including the singular value decomposition (SVD) and the principal component analysis (PCA). We show how FiGaRo can be used to compute the orthogonal matrix in the QR decomposition, the SVD and the PCA of the join output without the need to materialize the join output. A suite of experiments validate that FiGaRo can outperform both in runtime performance and numerical accuracy the LAPACK library Intel MKL by a factor proportional to the gap between the sizes of the join output and input.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信