Run-time operator state spilling for memory intensive long-running queries

B. Liu, Yali Zhu, Elke A. Rundensteiner
Proceedings of the 2006 ACM SIGMOD international conference on Management of data. Published 2006-06-27. DOI: 10.1145/1142473.1142513 (https://doi.org/10.1145/1142473.1142513)
Citations: 56

Abstract

Main memory is a critical resource when processing long-running queries over data streams with state-intensive operators. In this work, we investigate state spill strategies that handle run-time memory shortages when processing such complex queries by selectively pushing operator states to disk. Unlike previous solutions, which all focus on a single operator, we target queries with multiple state-intensive operators. We observe an interdependency among the operators in a query plan when spilling operator states, and we illustrate that existing strategies, which do not take this interdependency into account, become largely ineffective in this query context. A consolidated plan-level spill strategy must therefore be devised to address this problem. We propose several data spill strategies that aim to maximize run-time query throughput in memory-constrained environments. The bottom-up state spill strategy is an operator-level strategy that treats all data in one operator state equally. More sophisticated partition-level data spill strategies take different characteristics of the input data into account: the local output, the global output, and the global output with penalty strategies. All proposed state spill strategies have been implemented in the D-CAPE continuous query system. The experimental results confirm the effectiveness of the proposed strategies. In particular, the global output strategy and the global output with penalty strategy show favorable results compared with the other two, more localized strategies.
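
To make the idea of a partition-level spill decision concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not define how partitions are scored, so the sketch assumes a simple productivity ratio (results contributed per unit of memory held) and greedily spills the least productive partitions until a memory target is met. The names `Partition`, `choose_spill`, and every field are hypothetical. Which counter feeds `outputs` (an operator's own results versus the query's final results) is one plausible reading of what separates a local from a global variant, and that reading is an assumption based only on the strategy names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    # All fields are hypothetical; the abstract does not specify them.
    operator: str   # operator whose state holds this partition
    pid: int        # partition identifier
    size: int       # memory held by the partition (tuples or bytes)
    outputs: int    # results this partition has contributed so far


def choose_spill(partitions: List[Partition], amount_to_free: int) -> List[Partition]:
    """Pick partitions to push to disk, least productive first.

    Productivity is assumed to be outputs / size. Empty partitions are
    skipped because spilling them frees no memory.
    """
    candidates = [p for p in partitions if p.size > 0]
    ranked = sorted(candidates, key=lambda p: p.outputs / p.size)

    chosen: List[Partition] = []
    freed = 0
    for p in ranked:
        if freed >= amount_to_free:
            break
        chosen.append(p)
        freed += p.size
    return chosen


if __name__ == "__main__":
    state = [
        Partition("join1", 1, size=100, outputs=10),
        Partition("join1", 2, size=50, outputs=50),
        Partition("join2", 3, size=200, outputs=4),
    ]
    for p in choose_spill(state, amount_to_free=250):
        print(p.operator, p.pid)
```

Ranking across all operators at once, as this sketch does, is the simplest way to treat the plan as a whole. An operator-level strategy in the spirit of the bottom-up approach would instead select entire operator states without looking at individual partitions.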