Online Nonstop Task Management for Storm-Based Distributed Stream Processing Engines

IF 1.2 | Zone 3 (Computer Science) | JCR Q4 (Computer Science, Hardware & Architecture)
Zhou Zhang, Pei-Quan Jin, Xi-Ke Xie, Xiao-Liang Wang, Rui-Cheng Liu, Shou-Hong Wan
Journal of Computer Science and Technology, published 2024-01-30
DOI: 10.1007/s11390-021-1629-9 (https://doi.org/10.1007/s11390-021-1629-9)
Citations: 0

Abstract

Most distributed stream processing engines (DSPEs) do not support online task management and cannot adapt to time-varying data flows. Recent studies have proposed online task deployment algorithms to address this problem. However, these approaches cannot guarantee Quality of Service (QoS) when the task deployment changes at runtime, because the task migrations triggered by a deployment change are very costly. We study one of the most popular DSPEs, Apache Storm, and find that when a task needs to be migrated, Storm has to stop the resource (implemented as a Worker process in Storm) on which the task is deployed. This stops and restarts every task in that resource, which makes task migration perform poorly. To solve this problem, we propose N-Storm (Nonstop Storm), a DSPE that decouples tasks from resources. N-Storm allows the tasks allocated to a resource to be changed at runtime through a thread-level scheme for task migration. In particular, we add a local shared key/value store on each node so that resources are aware of changes in the allocation plan; each resource can thus manage its tasks at runtime. Based on N-Storm, we further propose Online Task Deployment (OTD). Unlike traditional task deployment algorithms, which deploy all tasks at once without considering the migration cost incurred by a re-deployment, OTD gradually adjusts the current task deployment toward an optimized one based on the communication cost and the runtime states of resources. We demonstrate that OTD can adapt to different kinds of applications, including computation-intensive and communication-intensive applications. Experimental results on a real DSPE cluster show that N-Storm avoids the system stop and reduces the performance degradation time by up to 87% compared with Apache Storm and other state-of-the-art approaches. In addition, OTD increases the average CPU usage by 51% for computation-intensive applications and reduces network communication costs by 88% for communication-intensive applications.
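
The thread-level migration idea in the abstract can be illustrated with a small sketch. This is not N-Storm's implementation: the `Worker` class, the `reconcile` method, and the plain dictionary standing in for the node-local shared key/value store are all hypothetical names chosen for illustration. The sketch assumes a resource is a long-lived process that runs each task in its own thread, reads its allocation from the shared store, and starts or stops only the threads whose allocation changed.

```python
import threading


class Worker:
    """Hypothetical long-lived resource: tasks are threads, the process never restarts."""

    def __init__(self, worker_id, plan_store):
        self.worker_id = worker_id
        self.plan_store = plan_store   # assumption: dict standing in for the local key/value store
        self.tasks = {}                # task_id -> (thread, stop_event)

    def _run_task(self, stop_event):
        # Placeholder task body; a real task would process stream tuples here.
        while not stop_event.wait(0.01):
            pass

    def reconcile(self):
        """Apply only the difference between the allocation plan and the running tasks."""
        wanted = set(self.plan_store.get(self.worker_id, ()))
        running = set(self.tasks)
        for task_id in running - wanted:
            # Task migrated away: stop this one thread, leave the others untouched.
            thread, stop_event = self.tasks.pop(task_id)
            stop_event.set()
            thread.join()
        for task_id in wanted - running:
            # Task migrated in: start a new thread inside the same process.
            stop_event = threading.Event()
            thread = threading.Thread(
                target=self._run_task, args=(stop_event,), daemon=True
            )
            thread.start()
            self.tasks[task_id] = (thread, stop_event)
        return sorted(self.tasks)

    def stop_all(self):
        """Stop every task thread (used only for a clean shutdown)."""
        for thread, stop_event in self.tasks.values():
            stop_event.set()
            thread.join()
        self.tasks.clear()
```

In this sketch, moving a task from one worker to another amounts to updating two entries in the shared store and calling `reconcile` on both workers; tasks that stay where they are keep running in their original threads, which is the contrast with stopping and restarting a whole Worker process.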

Source journal: Journal of Computer Science and Technology (Engineering and Technology, Computer Science: Software Engineering)
CiteScore: 4.00
Self-citation rate: 0.00%
Articles published: 2255
Review time: 9.8 months
Journal description: Journal of Computer Science and Technology (JCST), the first English language journal in the computer field published in China, is an international forum for scientists and engineers involved in all aspects of computer science and technology to publish high quality and refereed papers. Papers reporting original research and innovative applications from all parts of the world are welcome. Papers for publication in the journal are selected through rigorous peer review, to ensure originality, timeliness, relevance, and readability. While the journal emphasizes the publication of previously unpublished materials, selected conference papers with exceptional merit that require wider exposure are, at the discretion of the editors, also published, provided they meet the journal's peer review standards. The journal also seeks clearly written survey and review articles from experts in the field, to promote insightful understanding of the state-of-the-art and technology trends. Topics covered by Journal of Computer Science and Technology include but are not limited to: -Computer Architecture and Systems -Artificial Intelligence and Pattern Recognition -Computer Networks and Distributed Computing -Computer Graphics and Multimedia -Software Systems -Data Management and Data Mining -Theory and Algorithms -Emerging Areas