FlatFIT: Accelerated Incremental Sliding-Window Aggregation For Real-Time Analytics

Anatoli U. Shein, Panos K. Chrysanthis, Alexandros Labrinidis
Published in: Proceedings of the 29th International Conference on Scientific and Statistical Database Management, 2017-06-27
DOI: 10.1145/3085504.3085509 (https://doi.org/10.1145/3085504.3085509)
Citations: 24

Abstract

Data stream processing is becoming essential in most advanced scientific and business applications as data production rates increase. Companies compete to ingest high-velocity data efficiently and to apply some form of computation to it in order to make better business decisions. To compete successfully in this environment, companies focus on the most recent data within a count-based or time-based window by continuously executing aggregate queries over it. Incremental sliding-window computation is commonly used to avoid the cost of re-evaluating the aggregate value of the window from scratch on every update. The state-of-the-art FlatFAT technique executes aggregate continuous queries (ACQs) with high efficiency, but it does not scale well with increasing workloads. In this paper we propose a novel algorithm, FlatFIT, that accelerates such calculations by intelligently maintaining index structures, leading to higher reuse of intermediate calculations and thus exceptional scalability in systems with heavy workloads. Our theoretical analysis shows that FlatFIT is superior to FlatFAT in both time and space complexity while maintaining the same query generality. Given a window of size n, FlatFIT achieves constant algorithmic complexity, compared to the O(log(n)) complexity of FlatFAT. We experimentally show that FlatFIT achieves up to a 17x throughput improvement over FlatFAT for the same input workload while using less memory.
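To make the cost contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' code. It compares re-evaluating every window from scratch with a simplified flat binary tree over a circular buffer, in the spirit of tree-based approaches such as FlatFAT. It does not implement FlatFIT's index structures. The names `naive_window_aggregates`, `FlatTreeWindow`, and `tree_window_aggregates` are hypothetical, and the sketch assumes a count-based window that slides by one item, a window size that is a power of two, and an associative, commutative operator.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def naive_window_aggregates(stream: Sequence[T], n: int,
                            op: Callable[[T, T], T]) -> List[T]:
    """Re-evaluate each full window from scratch: n - 1 combines per slide."""
    results = []
    for end in range(n, len(stream) + 1):
        acc = stream[end - n]
        for item in stream[end - n + 1:end]:
            acc = op(acc, item)
        results.append(acc)
    return results


class FlatTreeWindow:
    """Simplified flat binary tree over a circular buffer of n leaves.

    Assumptions (illustrative only): n is a power of two and op is
    associative and commutative, so leaf order does not matter.
    """

    def __init__(self, n: int, op: Callable[[T, T], T], identity: T) -> None:
        if n <= 0 or n & (n - 1) != 0:
            raise ValueError("n must be a positive power of two")
        self.n = n
        self.op = op
        # Index 1 is the root; leaves occupy indices n .. 2n - 1.
        self.tree = [identity] * (2 * n)
        self.next_slot = 0

    def insert(self, value: T) -> None:
        """Overwrite the oldest leaf and refresh its log2(n) ancestors."""
        i = self.n + self.next_slot
        self.tree[i] = value
        i //= 2
        while i >= 1:
            self.tree[i] = self.op(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2
        self.next_slot = (self.next_slot + 1) % self.n

    def query(self) -> T:
        """Aggregate of the whole window, read from the root."""
        return self.tree[1]


def tree_window_aggregates(stream: Sequence[T], n: int,
                           op: Callable[[T, T], T], identity: T) -> List[T]:
    """Emit one aggregate per slide once the window has filled."""
    window = FlatTreeWindow(n, op, identity)
    results = []
    for count, item in enumerate(stream, start=1):
        window.insert(item)
        if count >= n:
            results.append(window.query())
    return results
```

In this sketch, each slide costs about log2(n) combines in the tree version versus n - 1 in the naive version. The abstract's claim is that FlatFIT reduces the per-slide cost further, to a constant, which this sketch does not attempt to reproduce.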