A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing
Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya
Proceedings of the 48th International Conference on Parallel Processing, 2019-08-05
DOI: 10.1145/3337821.3337870 (https://doi.org/10.1145/3337821.3337870)
Citation count: 2
Abstract
With the increasing demand for data-driven decision making, there is an urgent need to process geographically distributed data streams in real time. Existing scheduling and resource management schemes efficiently optimize stream processing performance with awareness of resources, quality of service, and network traffic. However, the correlation between network delay and inter-operator communication patterns is not well understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with ever-changing network conditions and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to clustered hosts using the weighted shortest processing time heuristic. In addition, we establish a 3-dimensional coordinate system that promptly reflects network conditions, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate its efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system, supported by the proposed approach, that applies the stream processing paradigm to pedestrian behavior analysis. The prototype application estimates the walking time for a given path according to real crowd traffic. The promising processing performance results further illustrate the efficiency of the proposed approach.
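The abstract names the weighted shortest processing time (WSPT) heuristic but does not spell out how the mapping is carried out. As a minimal sketch only, assuming each fused operator carries a weight and an estimated processing time, WSPT orders items by the ratio of weight to processing time, largest first. The `FusedOperator` class, its fields, the host names, and the greedy "least-loaded host" assignment are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class FusedOperator:
    # All fields are illustrative assumptions, not taken from the paper.
    name: str
    weight: float           # e.g. a priority derived from communication volume
    processing_time: float  # estimated computational cost of the fused operator


def wspt_order(ops):
    """Order operators by weight / processing_time, largest ratio first."""
    return sorted(ops, key=lambda op: op.weight / op.processing_time, reverse=True)


def assign_to_hosts(ops, hosts):
    """Greedy list scheduling: walk the WSPT order and place each operator
    on the host with the smallest accumulated processing time so far."""
    heap = [(0.0, host) for host in hosts]
    heapq.heapify(heap)
    placement = {}
    for op in wspt_order(ops):
        load, host = heapq.heappop(heap)
        placement[op.name] = host
        heapq.heappush(heap, (load + op.processing_time, host))
    return placement


ops = [
    FusedOperator("A", weight=1.0, processing_time=4.0),
    FusedOperator("B", weight=6.0, processing_time=2.0),
    FusedOperator("C", weight=3.0, processing_time=3.0),
]
placement = assign_to_hosts(ops, ["h1", "h2"])
```

In this toy run the WSPT order is B, C, A, so B and A share one host while C takes the other. A network-aware scheme as described in the abstract would presumably also factor network delay between hosts into the choice, which this sketch leaves out.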