A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing

Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya
{"title":"A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing","authors":"Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya","doi":"10.1145/3337821.3337870","DOIUrl":null,"url":null,"abstract":"With the increasing demand for data-driven decision making, there is an urgent need for processing geographically distributed data streams in real-time. The existing scheduling and resource management schemes efficiently optimize stream processing performance with the awareness of resource, quality-of-service, and network traffic. However, the correlation between network delay and inter-operator communication pattern is not well-understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with the ever-changing network condition and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to the clustered hosts with the weighted shortest processing time heuristic. Meanwhile, we established a 3-dimensional coordinate system for prompt reflection of the network condition, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate the efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system supported by the proposed approach that aims to utilize the stream processing paradigm for pedestrian behavior analysis. The prototype application estimates walking time for a given path according to the real crowd traffic. The promising evaluation results of processing performance further illustrate the efficiency of the proposed approach.","PeriodicalId":405273,"journal":{"name":"Proceedings of the 48th International Conference on Parallel Processing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 48th International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3337821.3337870","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

With the increasing demand for data-driven decision making, there is an urgent need for processing geographically distributed data streams in real-time. The existing scheduling and resource management schemes efficiently optimize stream processing performance with the awareness of resource, quality-of-service, and network traffic. However, the correlation between network delay and inter-operator communication pattern is not well-understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with the ever-changing network condition and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to the clustered hosts with the weighted shortest processing time heuristic. Meanwhile, we established a 3-dimensional coordinate system for prompt reflection of the network condition, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate the efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system supported by the proposed approach that aims to utilize the stream processing paradigm for pedestrian behavior analysis. The prototype application estimates walking time for a given path according to the real crowd traffic. The promising evaluation results of processing performance further illustrate the efficiency of the proposed approach.
一种基于网络感知和分区的数据流处理资源管理方案
随着数据驱动决策需求的不断增长,迫切需要实时处理地理分布的数据流。现有的调度和资源管理方案通过对资源、服务质量和网络流量的感知,有效地优化了流处理性能。然而,网络延迟与运营商间通信模式之间的关系尚未得到很好的理解。在本研究中,我们提出了一种网络感知和基于分区的资源管理方案,以应对流处理中不断变化的网络条件和数据通信。该方法通过考虑单个算子的计算需求和算子间通信模式,实现算子融合。采用加权最短处理时间启发式算法将融合算子映射到集群主机。同时,我们建立了三维坐标系统,能够及时反映网络状况、实时流量和资源可用性。我们针对两个基准测试对所提出的方法进行了评估,结果证明了吞吐量和资源利用率方面的效率。我们还进行了一个案例研究,并实现了一个原型系统,该系统旨在利用流处理范式进行行人行为分析。原型应用程序根据真实的人群交通估计给定路径的步行时间。良好的处理性能评价结果进一步说明了该方法的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信