Near Optimal Linear Algebra in the Online and Sliding Window Models.

Proceedings ... annual Symposium on Foundations of Computer Science. Symposium on Foundations of Computer Science Pub Date : 2020-01-01

Vladimir Braverman, Petros Drineas, Cameron Musco, Christopher Musco, Jalaj Upadhyay, David P Woodruff, Samson Zhou

Pages: 517-528. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8375632/pdf/nihms-1696963.pdf


Abstract


We initiate the study of numerical linear algebra in the sliding window model, where only the most recent W updates in a stream form the underlying data set. Although many existing algorithms in the sliding window model use or borrow elements from the smooth histogram framework (Braverman and Ostrovsky, FOCS 2007), we show that many interesting linear-algebraic problems, including spectral and vector-induced matrix norms, generalized regression, and low-rank approximation, are not amenable to this approach in the row-arrival model. To overcome this challenge, we first introduce a unified row-sampling-based framework that gives randomized algorithms for spectral approximation, low-rank approximation/projection-cost preservation, and ℓ₁-subspace embeddings in the sliding window model, which often use nearly optimal space and achieve nearly input-sparsity runtime. Our algorithms are based on "reverse online" versions of offline sampling distributions such as (ridge) leverage scores, ℓ₁ sensitivities, and Lewis weights to quantify both the importance and the recency of a row; our structural results on these distributions may be of independent interest for future algorithmic design. Although our techniques initially address numerical linear algebra in the sliding window model, our row-sampling framework rather surprisingly implies connections to the well-studied online model; our structural results also give the first sample-optimal (up to lower-order terms) online algorithm for low-rank approximation/projection-cost preservation. Using this powerful primitive, we give online algorithms for column/row subset selection and principal component analysis that resolve the main open question of Bhaskara et al. (FOCS 2019). We also give the first online algorithm for ℓ₁-subspace embeddings. We further formalize the connection between the online model and the sliding window model by introducing an additional unified framework for deterministic algorithms using a merge-and-reduce paradigm and the concept of online coresets, which we define as a weighted subset of rows of the input matrix that can be used to compute a good approximation to some given function on all of its prefixes. Our sampling-based algorithms in the row-arrival online model yield online coresets, giving deterministic algorithms for spectral approximation, low-rank approximation/projection-cost preservation, and ℓ₁-subspace embeddings in the sliding window model that use nearly optimal space.
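To make the "reverse online" idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes one plausible reading of the abstract: an online ridge leverage score measures a row against the rows that arrived up to and including it, while a reverse online score measures it against the rows from it onward, so a row that is repeated by more recent rows scores low. The function names, the regularizer `lam`, and the exact score formula are illustrative assumptions; a real streaming algorithm would approximate these scores and sample rows rather than store the whole matrix.

```python
import numpy as np


def online_ridge_scores(A, lam=1e-6):
    """score[i] = a_i^T (A_i^T A_i + lam*I)^{-1} a_i, where A_i holds rows 0..i."""
    n, d = A.shape
    gram = lam * np.eye(d)          # running regularized Gram matrix
    scores = np.empty(n)
    for i, a in enumerate(A):
        gram += np.outer(a, a)      # include the current row
        scores[i] = a @ np.linalg.solve(gram, a)
    return scores


def reverse_online_ridge_scores(A, lam=1e-6):
    """Same score, but each row is measured against rows i..n-1 (newest first)."""
    return online_ridge_scores(A[::-1], lam)[::-1]


def offline_ridge_scores(A, lam=1e-6):
    """Ordinary ridge leverage scores computed from the full matrix."""
    gram = A.T @ A + lam * np.eye(A.shape[1])
    return np.einsum("ij,ij->i", A, np.linalg.solve(gram, A.T).T)


if __name__ == "__main__":
    # Row 0 is repeated by the more recent row 1; row 2 is a new direction.
    A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    print("online :", online_ridge_scores(A))
    print("reverse:", reverse_online_ridge_scores(A))
    print("offline:", offline_ridge_scores(A))
```

In this toy input the forward scores favor the first copy of the repeated row, while the reverse scores favor the most recent copy, which is the behavior a sliding window needs because older rows expire first. Both variants are at least as large as the offline scores, since each is computed against a subset of the rows.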
