New Characterizations in Turnstile Streams with Applications

Computational Complexity Conference (CCC) | Pub Date: 2016-05-29 | DOI: 10.4230/LIPIcs.CCC.2016.20

Yuqing Ai, Wei Hu, Yi Li, David P. Woodruff


Citations: 37

Abstract

Recently, [Li, Nguyen, Woodruff, STOC'2014] showed any 1-pass constant probability streaming algorithm for computing a relation $f$ on a vector $x \in \{-m, -(m-1), \ldots, m\}^n$ presented in the turnstile data stream model can be implemented by maintaining a linear sketch $A \cdot x \bmod q$, where $A$ is an $r \times n$ integer matrix and $q = (q_1, \ldots, q_r)$ is a vector of positive integers. The space complexity of maintaining $A \cdot x \bmod q$, not including the random bits used for sampling $A$ and $q$, matches the space of the optimal algorithm.

We give multiple strengthenings of this reduction, together with new applications. In particular, we show how to remove the following shortcomings of their reduction:

1. The Box Constraint. Their reduction applies only to algorithms that must be correct even if $\|x\|_\infty = \max_{i \in [n]} |x_i|$ is allowed to be much larger than $m$ at intermediate points in the stream, provided that $x \in \{-m, -(m-1), \ldots, m\}^n$ at the end of the stream. We give a condition under which the optimal algorithm is a linear sketch even if it works only when promised that $x \in \{-m, -(m-1), \ldots, m\}^n$ at all points in the stream. Using this, we show the first super-constant $\Omega(\log m)$ bits lower bound for the problem of maintaining a counter up to an additive $\epsilon m$ error in a turnstile stream, where $\epsilon$ is any constant in $(0, 1/2)$. Previous lower bounds are based on communication complexity and are only for relative error approximation; interestingly, we do not know how to prove our result using communication complexity. More generally, we show the first super-constant $\Omega(\log m)$ lower bound for additive approximation of $\ell_p$-norms; this bound is tight for $1 \le p \le 2$.

2. Negative Coordinates. Their reduction allows $x_i$ to be negative while processing the stream. We show an equivalence between 1-pass algorithms and linear sketches $A \cdot x \bmod q$ in dynamic graph streams, or more generally, the strict turnstile model, in which for all $i \in [n]$, $x_i \ge 0$ at all points in the stream. Combined with [Assadi, Khanna, Li, Yaroslavtsev, SODA'2016], this resolves the 1-pass space complexity of approximating the maximum matching in a dynamic graph stream, answering a question in that work.

3. 1-Pass Restriction. Their reduction only applies to 1-pass data stream algorithms in the turnstile model, while there exist algorithms for heavy hitters and for low rank approximation which provably do better with multiple passes. We extend the reduction to algorithms which make any number of passes, showing the optimal algorithm is to choose a new linear sketch at the beginning of each pass, based on the output of previous passes.
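To make the object in the abstract concrete, the following is a minimal illustrative sketch of maintaining $A \cdot x \bmod q$ under turnstile updates. It is not the paper's construction: the class name `ModularLinearSketch`, the helpers `sketch_of` and `demo`, and the way $A$ and $q$ are sampled here are all assumptions chosen only to show that the maintained state depends on the current vector $x$ alone, not on the order or history of updates.

```python
import random


class ModularLinearSketch:
    """Hypothetical helper: maintains s = A.x mod q with one modulus per row."""

    def __init__(self, A, q):
        # A: r rows of n integers each; q: r positive integer moduli.
        assert len(A) == len(q) and all(m > 0 for m in q)
        self.A = A
        self.q = q
        self.state = [0] * len(A)  # sketch of the all-zeros vector

    def update(self, i, delta):
        # Turnstile update x_i <- x_i + delta; delta may be negative.
        for j, row in enumerate(self.A):
            self.state[j] = (self.state[j] + row[i] * delta) % self.q[j]


def sketch_of(A, q, x):
    # Direct computation of A.x mod q from the full vector, for comparison.
    return [sum(a * v for a, v in zip(row, x)) % m for row, m in zip(A, q)]


def demo(seed=0, n=8, r=3, steps=200):
    # Arbitrary illustrative sampling of A and q (an assumption, not the paper's).
    rng = random.Random(seed)
    A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(r)]
    q = [rng.randint(2, 50) for _ in range(r)]
    sk = ModularLinearSketch(A, q)
    x = [0] * n
    for _ in range(steps):
        i, delta = rng.randrange(n), rng.choice([-1, 1])
        x[i] += delta          # the underlying vector, kept only to check the sketch
        sk.update(i, delta)    # the streaming algorithm stores just r residues
    return sk.state, sketch_of(A, q, x)
```

Calling `demo()` returns the incrementally maintained state together with the sketch computed directly from the final $x$; the two agree because each update adds a multiple of one column of $A$ and reduction mod $q_j$ commutes with addition.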

