Managing parallelism for stream processing in the cloud

HotCDP '12 Pub Date : 2012-04-10 DOI:10.1145/2169090.2169091
Nathan Backman, Rodrigo Fonseca, U. Çetintemel
Citations: 39

Abstract

Stream processing applications run continuously and have varying load. Cloud infrastructures present an attractive option to meet these fluctuating computational demands. Coordinating such resources to meet end-to-end latency objectives efficiently is important in preventing the frivolous use of cloud resources. We present a framework that parallelizes and schedules workflows of stream operators in real time to meet latency objectives. It supports data- and task-parallel processing of all workflow operators by all computing nodes while maintaining the ordering properties of sorted data streams. We show that a latency-oriented operator scheduling policy, coupled with the diversification of computing node responsibilities, encourages parallelism models that achieve end-to-end latency-minimization goals. We demonstrate the effectiveness of our framework with preliminary experimental results using a variety of real-world applications on heterogeneous clusters.
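
The abstract states that data-parallel processing keeps the ordering of sorted streams but does not say how. One common way to get that property, shown here purely as an illustrative assumption and not as the authors' mechanism, is to tag each tuple with a sequence number and release results through a reorder buffer. The names `ordered_parallel_map`, `operator`, and `workers` are hypothetical, and the sketch handles a finite batch rather than an unbounded stream.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor, as_completed


def ordered_parallel_map(operator, stream, workers=4):
    """Apply `operator` to every tuple in parallel, emitting results in input order.

    Hypothetical helper: illustrates sequence-number reordering only.
    """
    results = []
    pending = []   # min-heap of (sequence number, result) waiting to be released
    next_seq = 0   # sequence number of the next result allowed to leave the buffer
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tag each input tuple with its position in the stream.
        futures = {pool.submit(operator, item): seq
                   for seq, item in enumerate(stream)}
        # Workers may finish in any order; buffer results until their turn comes.
        for fut in as_completed(futures):
            heapq.heappush(pending, (futures[fut], fut.result()))
            while pending and pending[0][0] == next_seq:
                results.append(heapq.heappop(pending)[1])
                next_seq += 1
    return results


if __name__ == "__main__":
    print(ordered_parallel_map(lambda x: x * x, range(5)))
```

The reorder buffer adds latency whenever an early tuple is slow, which is one reason a scheduler with end-to-end latency objectives would need to account for ordering constraints.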