Profiling and Predicting Application Performance on the Cloud
Matt Baughman, Ryan Chard, Logan T. Ward, Jason Pitt, K. Chard, Ian T. Foster
2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC), December 2018. DOI: 10.1109/UCC.2018.00011
Citations: 15
Abstract
Cloud providers continue to expand and diversify their collections of leasable resources to meet the needs of an increasingly wide range of applications. While this flexibility is a key benefit of the cloud, it also creates a complex landscape in which users are faced with many resource choices for a given application. Suboptimal selections can both degrade performance and increase costs. Given the rapidly evolving pool of resources, it is infeasible for users alone to select instance types; instead, automated methods are needed to simplify and guide resource provisioning. Here we present a method for the automatic prediction of application performance on arbitrary cloud instances. We combine offline and online profiling approaches, using historical data gathered from non-cloud environments and targeted profiling runs on cloud environments to create a composite application model that can predict run times on a given cloud instance type for a given input data size. We demonstrate an average error of 17.2% across nine applications used in production bioinformatics workflows. Finally, we evaluate an experiment design approach to explore the trade-off between the cost of profiling and the accuracy of our models. Using this approach, with no prior knowledge, we show that four selectively chosen profiling experiments yield a model whose performance is within 30% of one trained using all instance types.
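The abstract describes a composite model that combines an offline scaling profile (historical runs on non-cloud machines) with online calibration (a few targeted runs on cloud instances). The sketch below illustrates one way that split could work; the class name, the linear runtime-vs-input-size model, and the median-ratio calibration are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an offline-fit, online-calibrated runtime predictor.
# Assumes runtime scales roughly linearly with input size on a reference
# machine, and that each instance type differs by a constant speed factor.
import numpy as np
from sklearn.linear_model import LinearRegression

class CompositeRuntimeModel:
    def __init__(self):
        self.offline = LinearRegression()  # runtime vs. input size (offline data)
        self.instance_factor = {}          # per-instance-type calibration factor

    def fit_offline(self, input_sizes, runtimes):
        # Learn how runtime scales with input size from historical,
        # non-cloud runs on a reference machine.
        X = np.asarray(input_sizes, dtype=float).reshape(-1, 1)
        self.offline.fit(X, np.asarray(runtimes, dtype=float))

    def calibrate(self, instance_type, input_sizes, runtimes):
        # Use a small number of targeted cloud profiling runs to estimate
        # how much faster or slower this instance type is than the
        # reference machine (median of observed/predicted ratios).
        X = np.asarray(input_sizes, dtype=float).reshape(-1, 1)
        ratios = np.asarray(runtimes, dtype=float) / self.offline.predict(X)
        self.instance_factor[instance_type] = float(np.median(ratios))

    def predict(self, instance_type, input_size):
        # Composite prediction: offline scaling model times the online
        # per-instance calibration factor (1.0 if never calibrated).
        base = float(self.offline.predict(np.array([[float(input_size)]]))[0])
        return base * self.instance_factor.get(instance_type, 1.0)
```

In practice the offline regressor would be chosen to match each application's observed scaling (the paper's nine bioinformatics applications need not all scale linearly); the point of the sketch is the split between an offline profile and cheap online calibration.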
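The abstract's closing result, reaching within 30% of a fully trained model with only four profiling experiments, suggests an experiment-design step that picks which instance types to profile under a fixed budget. The paper does not specify the selection rule here; the sketch below shows one plausible, prior-knowledge-free approach, farthest-point sampling over instance hardware specs, with all names and features being assumptions for illustration.

```python
# Hedged sketch: choose a small profiling budget by picking instance
# types whose specs are maximally spread out, so a model trained on
# those few runs covers the whole instance pool.
import numpy as np

def select_experiments(instance_specs, budget=4):
    """instance_specs: dict mapping instance type -> (vCPUs, memory_GiB)."""
    names = list(instance_specs)
    X = np.array([instance_specs[n] for n in names], dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # normalize spec scales
    chosen = [0]                              # seed with an arbitrary type
    while len(chosen) < budget:
        # Distance from each candidate to its nearest already-chosen type.
        dists = np.min(
            [np.linalg.norm(X - X[c], axis=1) for c in chosen], axis=0
        )
        chosen.append(int(np.argmax(dists)))  # add the farthest candidate
    return [names[i] for i in chosen]

# Example with hypothetical AWS-like instance specs:
print(select_experiments({
    "m5.large": (2, 8), "m5.4xlarge": (16, 64), "c5.xlarge": (4, 8),
    "r5.2xlarge": (8, 64), "t3.medium": (2, 4),
}))
```

The trade-off the paper evaluates is then direct: each added experiment costs a profiling run on another instance type but shrinks the gap to a model trained on all of them.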