HW3C
Authors: Yuewen Wu, Heng Wu, Wen-bo Zhang, Yuanjia Xu, Jun Wei, Hua Zhong
Published in: Proceedings of the Tenth Asia-Pacific Symposium on Internetware, 2018-09-16
DOI: 10.1145/3275219.3275224 (https://doi.org/10.1145/3275219.3275224)
Citations: 2
Abstract
Picking the best cloud configuration for recurring big data analytics jobs running in clouds is a major challenge. Prior approaches, such as CherryPick, may settle on a sub-optimal configuration because they must cover a broad spectrum of cloud configurations with only a few test runs. We present HW3C, a heuristic-based workload classification and cloud configuration system for big data analytics jobs. Our insight is to classify a job by comparing its resource preference and usage information with those of other jobs, and then to use heuristic rules to distinguish bad samples from good ones in the Bayesian Optimization algorithm. Our experiments on HiBench and SparkBench in Aliyun ECS show that job performance improved by 53% on average compared with CherryPick, while resource cost was reduced by 40% on average.
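The two steps named in the abstract, classifying a job by comparing its resource usage with other jobs and then applying heuristic rules to discard bad samples, could be pictured roughly as follows. This is only a minimal sketch under stated assumptions: the usage profiles, class labels, thresholds, and function names are hypothetical and are not taken from the paper, and the Bayesian Optimization loop that would consume the pruned candidates is omitted.

```python
from math import sqrt

# Hypothetical reference profiles (assumption, not from the paper):
# average (cpu, memory, disk_io, network) utilisation, each in [0, 1].
KNOWN_JOBS = {
    "cpu_bound": (0.9, 0.3, 0.2, 0.1),
    "memory_bound": (0.4, 0.9, 0.2, 0.1),
    "io_bound": (0.3, 0.3, 0.9, 0.4),
}


def classify(profile):
    """Return the label of the known profile nearest to `profile` (Euclidean distance)."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(KNOWN_JOBS, key=lambda label: dist(profile, KNOWN_JOBS[label]))


def is_bad_sample(job_class, config):
    """Illustrative heuristic rules; the thresholds are made-up assumptions."""
    mem_per_core = config["memory_gb"] / config["vcpus"]
    if job_class == "memory_bound" and mem_per_core < 4:
        return True  # too little memory per core for a memory-hungry job
    if job_class == "cpu_bound" and config["vcpus"] < 4:
        return True  # too few cores for a compute-heavy job
    return False


def prune(job_class, candidates):
    """Keep only the candidate configurations the heuristic rules do not reject."""
    return [c for c in candidates if not is_bad_sample(job_class, c)]


if __name__ == "__main__":
    job_class = classify((0.35, 0.85, 0.25, 0.1))
    candidates = [
        {"vcpus": 8, "memory_gb": 16},
        {"vcpus": 4, "memory_gb": 32},
        {"vcpus": 2, "memory_gb": 16},
    ]
    print(job_class, prune(job_class, candidates))
```

In a sketch like this, pruning happens before any test run is spent, so a sample-efficient optimizer would only evaluate configurations that the rules consider plausible for the job's class.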