Proactive scaling of distributed stream processing work flows using workload modelling: doctoral symposium

Thomas Cooper
{"title":"Proactive scaling of distributed stream processing work flows using workload modelling: doctoral symposium","authors":"Thomas Cooper","doi":"10.1145/2933267.2933429","DOIUrl":null,"url":null,"abstract":"In recent years there has been significant development in the area of distributed stream processing systems (DSPS) such as Apache Storm, Spark, and Flink. These systems allow complex queries on streaming data to be distributed across multiple worker nodes in a cluster. DSPS often provide the tools to add/remove resources and take advantage of cloud infrastructure to scale their operation. However, the decisions behind this are generally left to the administrators of these systems. There have been several studies focused on finding optimal operator deployments of DSPS operators across a cluster. However, these systems often do not optimise with regard to a given Service Level Agreement (SLA) and where they do, they do not take incoming workload into account. To our knowledge there has been little or no work based around proactively scaling the DSPS with regard to incoming workload in order to maintain SLAs. This PhD will focus on predicting incoming workloads using time series analysis. In order to assess whether a given predicted workload will breach a SLA the response of a DSPS work flow to incoming workload will be modelled using a queuing theory approach. The intention is to build a system that can tune the parameters of this queuing theoretic model, using output metrics such as end-to-end latency and throughput, as the DSPS is running. The end result will be a system that can identify potential SLA breaches before they happen, and initiate a proactive scaling response. Initially, Apache Storm will be used as the test DSPS, however it is anticipated that the system developed during this PhD will be applicable to other DSPS that use a graph-based description of the streaming work flow e.g. Apache Spark and Flink.","PeriodicalId":277061,"journal":{"name":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2933267.2933429","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

In recent years there has been significant development in the area of distributed stream processing systems (DSPS) such as Apache Storm, Spark, and Flink. These systems allow complex queries on streaming data to be distributed across multiple worker nodes in a cluster. DSPS often provide the tools to add/remove resources and take advantage of cloud infrastructure to scale their operation. However, the decisions behind this are generally left to the administrators of these systems. There have been several studies focused on finding optimal operator deployments of DSPS operators across a cluster. However, these systems often do not optimise with regard to a given Service Level Agreement (SLA) and where they do, they do not take incoming workload into account. To our knowledge there has been little or no work based around proactively scaling the DSPS with regard to incoming workload in order to maintain SLAs. This PhD will focus on predicting incoming workloads using time series analysis. In order to assess whether a given predicted workload will breach a SLA the response of a DSPS work flow to incoming workload will be modelled using a queuing theory approach. The intention is to build a system that can tune the parameters of this queuing theoretic model, using output metrics such as end-to-end latency and throughput, as the DSPS is running. The end result will be a system that can identify potential SLA breaches before they happen, and initiate a proactive scaling response. Initially, Apache Storm will be used as the test DSPS, however it is anticipated that the system developed during this PhD will be applicable to other DSPS that use a graph-based description of the streaming work flow e.g. Apache Spark and Flink.
使用工作量建模的分布式流处理工作流的主动扩展:博士研讨会
近年来,分布式流处理系统(DSPS)如Apache Storm、Spark和Flink等领域有了显著的发展。这些系统允许对流数据的复杂查询分布在集群中的多个工作节点上。dsp通常提供工具来添加/删除资源,并利用云基础设施来扩展其操作。但是,这背后的决策通常留给这些系统的管理员。已经有一些研究集中在寻找跨集群的DSPS运营商的最佳运营商部署。然而,这些系统通常不会针对给定的服务水平协议(SLA)进行优化,即使进行了优化,也不会考虑传入的工作负载。据我们所知,为了维护sla,很少或根本没有针对传入工作负载主动扩展dsp的工作。本博士将专注于使用时间序列分析预测传入的工作负载。为了评估给定的预测工作负载是否会违反SLA,将使用排队论方法对DSPS工作流对传入工作负载的响应进行建模。目的是构建一个系统,该系统可以在dsp运行时使用端到端延迟和吞吐量等输出指标来调优这个排队理论模型的参数。最终的结果将是一个系统,它可以在潜在的SLA违规发生之前识别出来,并启动一个主动的扩展响应。最初,Apache Storm将被用作测试dsp,但预计在本博士期间开发的系统将适用于其他使用基于图形的流工作流程描述的dsp,例如Apache Spark和Flink。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信