Streaming Algorithms with Few State Changes

Rajesh Jayaram, David P. Woodruff, Samson Zhou
Proceedings of the ACM on Management of Data. Published 2024-05-10. DOI: 10.1145/3651145 (https://doi.org/10.1145/3651145)

Abstract

In this paper, we study streaming algorithms that minimize the number of changes made to their internal state (i.e., memory contents). While the design of streaming algorithms typically focuses on minimizing space and update time, these metrics fail to capture the asymmetric costs, inherent in modern hardware and database systems, of reading versus writing to memory. In fact, most streaming algorithms write to their memory on every update, which is undesirable when writing is significantly more expensive than reading. This raises the question of whether streaming algorithms that use both small space and few memory writes are possible.

We first demonstrate that, for the fundamental $F_p$ moment estimation problem with $p \ge 1$, any streaming algorithm that achieves a constant-factor approximation must make $\Omega(n^{1-1/p})$ internal state changes, regardless of how much space it uses. Perhaps surprisingly, we show that this lower bound can be matched by an algorithm that also has near-optimal space complexity. Specifically, we give a $(1+\varepsilon)$-approximation algorithm for $F_p$ moment estimation that uses a near-optimal $\tilde{O}_\varepsilon(n^{1-1/p})$ number of state changes while simultaneously achieving near-optimal space: for $p \in [1,2)$, our algorithm uses $\mathrm{poly}(\log n, 1/\varepsilon)$ bits of space, while for $p > 2$, the algorithm uses $\tilde{O}_\varepsilon(n^{1-1/p})$ space. We similarly design streaming algorithms that are simultaneously near-optimal in both space complexity and the number of state changes for the heavy-hitters problem, sparse support recovery, and entropy estimation. Our results demonstrate that an optimal number of state changes can be achieved without sacrificing space complexity.
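
To make the notion of "few state changes" concrete, the sketch below uses the classical Morris approximate counter. This is a well-known technique and is not the algorithm from the paper; it is shown only as an illustration. It estimates the length of a stream of unit insertions (the p = 1 moment under that assumption) while writing to its single counter only rarely. The class name `MorrisCounter`, the parameter `a`, and the helper `run_demo` are hypothetical names chosen for this example, and the random number generator is treated as external randomness rather than as part of the counted state.

```python
import random


class MorrisCounter:
    """Classical Morris counter (illustrative; not the paper's algorithm)."""

    def __init__(self, a=0.01, seed=None):
        self.a = a                # assumed accuracy knob: smaller a means lower variance, more writes
        self.x = 0                # the only stored state
        self.state_changes = 0    # bookkeeping for the demo, not part of the sketch itself
        self._rng = random.Random(seed)

    def update(self):
        # Every update reads x, but x is written only with probability (1 + a)^(-x).
        if self._rng.random() < (1.0 + self.a) ** (-self.x):
            self.x += 1
            self.state_changes += 1

    def estimate(self):
        # Unbiased estimate of the number of updates seen, since E[(1 + a)^x] = 1 + a * n.
        return ((1.0 + self.a) ** self.x - 1.0) / self.a


def run_demo(n=100_000, a=0.01, seed=0):
    """Feed n unit updates and return (estimate, number of writes to x)."""
    counter = MorrisCounter(a=a, seed=seed)
    for _ in range(n):
        counter.update()
    return counter.estimate(), counter.state_changes


if __name__ == "__main__":
    est, changes = run_demo()
    print(f"estimate={est:.0f}, state changes={changes}")
```

In this sketch the number of writes grows roughly like log(a·n)/log(1 + a) rather than n, which is the kind of read/write asymmetry the abstract has in mind; the exact figures depend on the random seed.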