CloudLEGO: scalable cross-VM-type application performance prediction

S. Meng, A. Iyengar, Ling Liu, Ting Wang, Jian Tan, I. Silva-Lepe, I. Rouvellou
{"title":"CloudLEGO: scalable cross-VM-type application performance prediction","authors":"S. Meng, A. Iyengar, Ling Liu, Ting Wang, Jian Tan, I. Silva-Lepe, I. Rouvellou","doi":"10.1145/2523616.2525948","DOIUrl":null,"url":null,"abstract":"Understanding the performance difference of a multi-tier Cloud application between different provisioning plans and workloads is difficult to achieve. A typical IaaS provider offers a variety of virtual server instances with different performance capacities and rental rates. Such instances are often marked with a high level description of their hardware/software configuration (e.g. 1 or 2 vC-PUs) which provides insufficient information on the performance of the virtual server instances. Furthermore, as each tier of an application can be independently provisioned with different types and numbers of VMs, the number of possible provisioning plans grows exponentially with each additional tier. Previous work [10] proposed to perform automatic experiments to evaluate candidate provisioning plans, which leads to high cost due to the exponential increase of candidate provisioning plans with the number of tiers and available VM types. While several existing works [8, 6, 7] studied a variety of performance models for multi-tier applications, these works assume that an application runs on a fixed deployment (with fixed machine type and number for each tier). We present CloudLEGO, an efficient cross-VM-type performance learning and prediction approach. Since building a model for each possible deployment is clearly not scalable, instead of treating each candidate deployment separately, CloudLEGO views them as derivatives from a single, fixed deployment. Accordingly, the task of learning the performance of a targeted deployment can be decoupled into learning the performance of the original fixed deployment and learning the performance difference between the original deployment and the targeted one. The key to efficiently capture performance difference between deployments is to find multiple independent changes that can be used to derive any deployment from the original deployment. CloudLEGO formulates such \"modular\" changes as VM type changes at a given tier. To capture changes of performance at a tier caused by VM type changes, CloudLEGO uses relative performance models [5] which predict the performance difference between a pair of VMs (rather than the absolute performance of a VM) for a given workload. Moreover, training relative performance models requires only performance data from Cloud monitoring services [1, 4] rather than fine-grain data such as per-tier response time which requires application instrumentation. Training relative performance models with traditional passive learning techniques would require a large amount of training data as performance data are collected uniformly in a single batch. We find that different types of VMs often share similar performance for many \"regions\" of workloads. To leverage this characteristic and guide the profiling to regions with high performance differences, CloudLEGO uses active learning techniques [2, 3, 9] that split the profiling process into multiple stages where data collected in one stage are used to identify high-value regions for the next profiling stage. As a result, it significantly speeds up the convergence of models and the profiling process due to substantially reduced measurement. 
We deploy CloudLEGO in IBM's Research Computing Cloud (RC2), an Infrastructure-as-a-Service Cloud, to evaluate its effectiveness. Our results suggest that CloudLEGO provides accurate predictions for various deployments and workloads with only a fraction of training cost incurred by existing techniques.","PeriodicalId":298547,"journal":{"name":"Proceedings of the 4th annual Symposium on Cloud Computing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 4th annual Symposium on Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2523616.2525948","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Understanding how the performance of a multi-tier Cloud application differs across provisioning plans and workloads is difficult. A typical IaaS provider offers a variety of virtual server instances with different performance capacities and rental rates. Such instances are often advertised with only a high-level description of their hardware/software configuration (e.g., 1 or 2 vCPUs), which gives insufficient information about their actual performance. Furthermore, because each tier of an application can be independently provisioned with different types and numbers of VMs, the number of possible provisioning plans grows exponentially with each additional tier. Previous work [10] proposed running automatic experiments to evaluate candidate provisioning plans, but this incurs high cost because the number of candidate plans grows exponentially with the number of tiers and available VM types. Several existing works [8, 6, 7] studied performance models for multi-tier applications, but they assume the application runs on a fixed deployment (a fixed machine type and number for each tier).

We present CloudLEGO, an efficient cross-VM-type performance learning and prediction approach. Since building a model for every possible deployment is clearly not scalable, CloudLEGO does not treat each candidate deployment separately; instead, it views each one as a derivative of a single, fixed deployment. The task of learning the performance of a target deployment is thus decoupled into learning the performance of the original fixed deployment and learning the performance difference between the original deployment and the target one. The key to efficiently capturing performance differences between deployments is to find multiple independent changes from which any deployment can be derived from the original one. CloudLEGO formulates such "modular" changes as VM type changes at a given tier. To capture the performance change at a tier caused by a VM type change, CloudLEGO uses relative performance models [5], which predict the performance difference between a pair of VM types (rather than the absolute performance of a VM) for a given workload. Moreover, training relative performance models requires only performance data from Cloud monitoring services [1, 4], rather than fine-grained data such as per-tier response times, which would require application instrumentation. Training relative performance models with traditional passive learning techniques would require a large amount of training data, because performance data are collected uniformly in a single batch. We observe that different types of VMs often exhibit similar performance over many "regions" of the workload space. To exploit this characteristic and steer profiling toward regions with large performance differences, CloudLEGO uses active learning techniques [2, 3, 9] that split the profiling process into multiple stages, where data collected in one stage identify high-value regions for the next stage. As a result, CloudLEGO significantly speeds up model convergence and the profiling process by substantially reducing the required measurements.

We deploy CloudLEGO on IBM's Research Computing Cloud (RC2), an Infrastructure-as-a-Service Cloud, to evaluate its effectiveness. Our results suggest that CloudLEGO provides accurate predictions for various deployments and workloads at only a fraction of the training cost incurred by existing techniques.
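To make the approach concrete, below is a minimal Python sketch (not the authors' implementation) of the two ideas described above: a relative performance model trained to predict the performance difference between a pair of VM types at one tier, and a staged active-learning loop that spends each stage's measurement budget on workload regions where the current model is least certain. The `measure_pair` hook, the random-forest regressor, and tree disagreement as the uncertainty score are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def train_relative_model(workloads, perf_deltas):
    """Fit a model mapping workload features -> performance difference
    between a pair of VM types at a single tier (a relative model)."""
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(workloads, perf_deltas)
    return model


def uncertainty(model, candidates):
    """Disagreement across the forest's trees for each candidate workload,
    used as the active-learning score."""
    per_tree = np.stack([t.predict(candidates) for t in model.estimators_])
    return per_tree.std(axis=0)


def staged_profiling(candidates, measure_pair, stages=4, budget_per_stage=10):
    """Staged active-learning loop: each stage measures the workload points
    where the current relative model is least certain, then retrains.
    `measure_pair(w)` is a hypothetical hook that runs workload w on both VM
    types and returns the observed performance difference."""
    rng = np.random.default_rng(0)

    # Stage 0: a small uniform seed sample of the workload space.
    idx = rng.choice(len(candidates), size=budget_per_stage, replace=False)
    X = candidates[idx]
    y = np.array([measure_pair(w) for w in X])
    model = train_relative_model(X, y)

    for _ in range(stages - 1):
        scores = uncertainty(model, candidates)
        # Spend the next stage's budget on the highest-uncertainty regions,
        # i.e., where the two VM types are expected to differ most.
        idx = np.argsort(scores)[-budget_per_stage:]
        X_new = candidates[idx]
        y_new = np.array([measure_pair(w) for w in X_new])
        X, y = np.vstack([X, X_new]), np.concatenate([y, y_new])
        model = train_relative_model(X, y)

    return model
```

In a full system, one such relative model per tier and VM-type pair could be combined with a model of the original fixed deployment to predict any target deployment, which is the decoupling the abstract describes.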