2019 IEEE International Conference on Cloud Engineering (IC2E): Latest Papers, Page 2

Host Hypervisor Trace Mining for Virtual Machine Workload Characterization

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00024

Hani Nemati, S. V. Azhari, M. Dagenais

Abstract: The efficient operation and resource management of multi-tenant data centers hosting thousands of services is a demanding task that requires precise and detailed information about the behaviour of each virtual machine (VM). Often, coarse measures such as CPU, memory, disk and network usage by VMs are considered when grouping them onto the same physical server, as detailed measures would require access to the guest operating system (OS), which is not feasible in a multi-tenant setting. In this paper, we propose host-level hypervisor tracing as a non-intrusive means to extract useful features that enable fine-grained characterization of VM behaviour. In particular, we extract VM blocking periods as well as virtual interrupt injection rates to detect multiple levels of resource intensiveness. In addition, we consider the resource contention rate due to other VMs and the host, along with reasons for exit from non-root to root privileged mode, revealing useful information about the nature of the underlying VM workload. We also use tracing to get information about the rate of process and thread preemption in each VM, extracting process and thread contention as another feature set. We then employ various feature selection strategies and assess the quality of the resulting workload clustering. Notably, we adopt a two-stage feature selection approach in addition to a one-shot clustering scheme. Moreover, we consider inter-cluster and intra-cluster similarity metrics, such as the silhouette score, to discover distinct groups of workloads as well as workload groups with significant overlap. This information can be used by 1) data center administrators to gain deeper visibility into the nature of various VMs running on their infrastructure, 2) performance engineers to assist root cause analysis of VM issues and 3) IaaS providers to help in resource management based on VM behavior.
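The silhouette score named in the abstract can be illustrated with a small example. This is a minimal pure-Python sketch and not the authors' pipeline: the two per-VM features (blocked-time fraction and a normalized interrupt injection rate), the sample values, and the function name `silhouette_score` are all assumptions made for illustration.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical feature vectors: (blocked-time fraction, normalized IRQ rate)
vm_features = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.85, 0.90)]
well_separated = silhouette_score(vm_features, [0, 0, 1, 1])
overlapping = silhouette_score(vm_features, [0, 1, 0, 1])
```

A score near 1 indicates compact, well-separated workload groups, while a score near 0 or below suggests groups that overlap, which is the distinction the abstract draws.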

Citations: 6

Importance of Application-Level Resource Management in Multi-Cloud Deployments

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00028

Z. Dimitrijevic, Cetin Sahin, Christian Tinnefeld, J. Patvarczki

Abstract: Cloud service providers started with Infrastructure as a Service (IaaS) offerings and over time expanded into Platform as a Service (PaaS) and Software as a Service (SaaS). Even though each provider has a rich product offering, there are many scenarios where a multi-cloud strategy is desirable: utilizing economic dynamics, preventing data lock-in with one vendor, circumventing geographic restrictions, complying with local regulations, or combining on-premise and public-cloud resources. The challenge from a consumer perspective with multi-cloud deployments is the lack of a common abstraction for the offered products and of a standardized way to express all of the application requirements for the resulting deployments. In this paper, we contribute by making yet another case for multi-cloud deployments and by predicting the emergence of a new generation of application-level resource managers which will natively support multi-cloud for enterprise applications. We identify three main components of feedback-loop-controlled application-level resource managers: the software life-cycle manager, the data storage and access manager, and the service execution manager.

Citations: 1

ShadeNF: Testing Online Network Functions

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00027

Hui Lu, Abhinav Srivastava, Yu Sun

Abstract: The correct implementation of network policies for "in-production" network functions is critical, as it determines the security, availability and performance of a production network. Conducting network testing for these network functions in a live production environment is attractive, as the production environment captures the most exact, realistic dynamic state and vulnerabilities of the system under test. However, doing so also brings potential risks of impacting or even damaging the production system. To address this tension, we present ShadeNF, a novel online platform for testing in-cloud network functions in a production-like environment, without disrupting the real production system. ShadeNF enables such a production-like environment with an exact live clone of production network functions and real production traffic as the test traffic. In designing and implementing ShadeNF, we address several key challenges and contribute new techniques in supporting such a testing platform, including an SDN-based live, consistent snapshot approach, a new programmable forwarding plane, and a scaled test traffic generator. We implement a ShadeNF prototype upon OpenStack and demonstrate that ShadeNF successfully captures the dynamics of production systems, and effectively localizes a range of policy violations in SDN/NFV systems.

Citations: 0

Information Models: Creating and Preserving Value in Volatile Cloud Resources

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00018

Chaojie Zhang, Varun Gupta, A. Chien

Abstract: Volatile resources are surplus cloud resources not consumed by high priority foreground (reserved/on-demand) load. These resources are exploited by a growing number of users. Today, cloud operators provide no statistical characterization of volatile resources. We consider how releasing such statistics could improve user value by studying Amazon's 608 EC2 Spot Instance types. Results show that as few as two parameters, such as (average, 90th percentile), can increase user value by 30%. These results are robust over nearly four-fifths (475 of 608) of instance types. Beyond competitive concerns, cloud operators are reluctant to share volatile resource statistics because they might be considered a service-level agreement (SLA), and thus constrain their ability to serve foreground load. We show that clever resource management can allay such concerns. We study two plausible classes of foreground load changes, showing one class where such a concern is indeed valid and another where it is not. We design two online resource management algorithms that detect foreground load variation and adapt to maintain a statistical SLA. The algorithms not only improve the ability to maintain guarantees and user value but also improve user experience, reducing job failures by 50%. These results apply to the Stable and Transition classes of instance types, which account for nearly all of the instance types (577 of 608).
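The two-parameter summary mentioned in the abstract (average, 90th percentile) can be computed as in the following sketch. It is only an illustration under stated assumptions: the sample values (hours until a volatile instance is revoked), the nearest-rank percentile definition, and the names `percentile` and `summarize` are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(values):
    """Two-parameter statistical description of a volatile resource."""
    return {"average": mean(values), "p90": percentile(values, 90)}


# Hypothetical observations: hours until revocation for one instance type
hours_until_revocation = [2, 5, 1, 8, 12, 3, 7, 4, 20, 6]
summary = summarize(hours_until_revocation)
```

With these made-up samples the summary is an average of 6.8 hours and a 90th percentile of 12 hours. A user could compare such a pair against a job's expected runtime when choosing an instance type.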

Citations: 5

SQUEET Program Committee

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/ic2e.2019.00-16

Citations: 0

Continuous Benchmarking: Using System Benchmarking in Build Pipelines

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00039

M. Grambow, Fabian Lehmann, David Bermbach

Citations: 18

Toward a Workload Allocation Optimizer for Power Saving in Data Centers

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00019

Ying-Feng Hsu, H. Kuwahara, Kazuhiro Matsuda, Morito Matsuoka

Abstract: The number and scale of data centers are both rapidly increasing due to a continuously growing demand for cloud computing services from many areas. Cloud computing infrastructure relies on a massive number of HPC servers to process millions of tasks and consumes an enormous amount of power. The implementation of advanced task allocation technology provides a solution for energy efficiency and has therefore become an essential goal for data centers. In this paper, we propose a novel CPU-intensive workload allocation optimizer (WAO) for power saving within data centers. This research makes three major contributions. First, we provide a data center monitoring module, which continually reports the latest status of the data center and stores operational data. Second, we propose an accurate and efficient server power prediction model for all servers in the HPC clusters. Third, we provide an optimal task assignment engine that evaluates and assigns tasks to the most appropriate server to facilitate minimal power consumption. Our experimental results show that our proposed WAO can obtain about 29.6% power savings and 26% more productivity in a real data center.
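The idea of assigning each task to the server that minimizes predicted power can be sketched as below. This is not the WAO described in the abstract: it swaps in a simple greedy rule and a hypothetical linear power model (a server draws nothing when empty, otherwise idle power plus a per-core cost), and the `Server` class, `assign` function, and all numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    idle_watts: float      # hypothetical power drawn once any task runs
    watts_per_core: float  # hypothetical extra power per busy core
    cores: int
    used: int = 0

    def power(self, used=None):
        """Predicted power in watts for a given number of busy cores."""
        used = self.used if used is None else used
        return 0.0 if used == 0 else self.idle_watts + self.watts_per_core * used


def assign(tasks, servers):
    """Greedily place each (task, cores) pair on the server whose predicted
    power rises the least."""
    placement = {}
    for task, cores in tasks:
        candidates = [s for s in servers if s.used + cores <= s.cores]
        if not candidates:
            raise RuntimeError(f"no server has {cores} free cores for {task}")
        best = min(candidates, key=lambda s: s.power(s.used + cores) - s.power())
        best.used += cores
        placement[task] = best.name
    return placement


servers = [Server("A", 100.0, 10.0, 8), Server("B", 60.0, 15.0, 8)]
placement = assign([("t1", 4), ("t2", 2), ("t3", 4)], servers)
total_watts = sum(s.power() for s in servers)
```

In this made-up scenario the first two tasks are packed onto server B, because waking server A would cost its idle power, and the third task goes to A only once B is full. A greedy rule like this is not guaranteed to be optimal.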

Citations: 3

The Future of Computing is Boring (and that is exciting!)

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00023

Aleksander Slominski, Vinod Muthusamy, Vatche Isahagian

Citations: 1

Understanding Synchronization Costs for Distributed ML on Transient Cloud Resources

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/IC2E.2019.00029

Pradeep Ambati, David E. Irwin, P. Shenoy, Lixin Gao, A. Ali-Eldin, Jeannie R. Albrecht

Abstract: Cloud platforms often execute parallel batch applications, such as distributed machine learning (ML), that include numerous synchronization barriers. These barriers, which prevent any task from advancing beyond a specified point until all tasks have reached that point, significantly degrade application performance by reducing it to that of the slowest "straggler" task. To address the problem, researchers have proposed numerous straggler mitigation techniques, including speculatively re-executing straggler tasks and various relaxations of strict barrier semantics. While these techniques improve parallel application performance, they incur a cost in terms of the resources wasted re-executing tasks or waiting. Importantly, these costs, which are often implicit in prior work that targets dedicated resources, become explicit in the cloud, which charges for resources at fine-grained intervals. In addition, the cost difference between techniques is exacerbated in cloud platforms, since they charge substantially less for transient resources that effectively yield a probabilistic performance across a wide range. While transient resources' low list price is attractive, revocations increase the frequency and severity of stragglers, which decreases parallel job performance and increases overall execution cost. To better understand the cost of synchronization, we develop simple analytical models of different straggler mitigation techniques and compare their cost and performance on on-demand and transient resources. Our analysis shows that i) transient servers offer complex tradeoffs compared to on-demand servers, and can result in higher overall costs despite their highly discounted price due to their probabilistic performance; ii) common approaches to straggler mitigation, which is a well-studied problem, are less effective using transient servers that cause frequent and severe stragglers; and iii) a recent approach to flexible synchronization offers the best cost and performance.
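The effect of a strict barrier on time and cost can be illustrated with a toy calculation. This is a sketch under stated assumptions and not the analytical model from the paper: every worker is billed until the slowest task finishes, and the task times, prices, and the function name `barrier_iteration` are hypothetical.

```python
def barrier_iteration(task_seconds, price_per_worker_second):
    """Return (duration, cost, wasted worker-seconds) for one iteration
    that ends only when the slowest task reaches the barrier."""
    duration = max(task_seconds)  # the straggler sets the pace
    cost = len(task_seconds) * duration * price_per_worker_second
    wasted = sum(duration - t for t in task_seconds)  # time spent waiting
    return duration, cost, wasted


# Hypothetical on-demand run: four equal tasks at full price
on_demand = barrier_iteration([10, 10, 10, 10], 1.0)

# Hypothetical transient run: 75% discount, but one task is delayed
# (for example by a revocation) and becomes a straggler
transient = barrier_iteration([10, 10, 10, 50], 0.25)
```

With these made-up numbers the on-demand iteration takes 10 s and costs 40 units, while the discounted transient iteration takes 50 s and costs 50 units with 120 worker-seconds spent waiting. This mirrors finding i) in the abstract: a lower list price does not guarantee a lower overall cost.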

Citations: 7

Title Page iii

2019 IEEE International Conference on Cloud Engineering (IC2E) Pub Date : 2019-06-01 DOI: 10.1109/ic2e.2019.00002

Citations: 0