PBSE: a robust path-based speculative execution for degraded-network tail tolerance in data-parallel frameworks
Riza O. Suminto, Cesar A. Stuardo, Alexandra Clark, Huan Ke, Tanakorn Leesatapornwongsa, Bo Fu, D. Kurniawan, V. Martin, Maheswara Rao G. Uma, Haryadi S. Gunawi. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17), September 2017. DOI: https://doi.org/10.1145/3127479.3131622

Abstract: We reveal loopholes in Speculative Execution (SE) implementations under a unique fault model: node-level network throughput degradation. This problem appears in many data-parallel frameworks such as Hadoop MapReduce and Spark. To address it, we present PBSE, a robust, path-based speculative execution that employs three key ingredients: path progress, path diversity, and path-straggler detection and speculation. We show how PBSE is superior to other approaches such as cloning and aggressive speculation under the aforementioned fault model. PBSE is a general solution, applicable to many data-parallel frameworks such as Hadoop/HDFS+QFS, Spark, and Flume.
A robust partitioning scheme for ad-hoc query workloads
Anil Shanbhag, Alekh Jindal, S. Madden, Jorge-Arnulfo Quiané-Ruiz, Aaron J. Elmore. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17), September 2017. DOI: https://doi.org/10.1145/3127479.3131613

Abstract: Data partitioning is crucial to improving query performance, and several workload-based partitioning techniques have been proposed in the database literature. However, many modern analytic applications involve ad-hoc or exploratory analysis where users do not have a representative query workload a priori. Static workload-based data partitioning techniques are therefore not suitable for such settings. In this paper, we propose Amoeba, a distributed storage system that uses adaptive multi-attribute data partitioning to efficiently support ad-hoc as well as recurring queries. Amoeba requires zero set-up and tuning effort, allowing analysts to get the benefits of partitioning without providing an upfront query workload. The key idea is to build and maintain a partitioning tree on top of the dataset. The partitioning tree allows us to answer queries with predicates by reading only a subset of the data. The initial partitioning tree is created without an upfront query workload, and Amoeba adapts it over time by incrementally repartitioning subtrees based on user queries. A prototype of Amoeba running on top of Apache Spark improves query performance by up to 7x over full scans and up to 2x over range-based partitioning techniques on TPC-H as well as a real-world workload.
{"title":"mBalloon: enabling elastic memory management for big data processing","authors":"Wei Chen, Aidi Pi, J. Rao, Xiaobo Zhou","doi":"10.1145/3127479.3132565","DOIUrl":"https://doi.org/10.1145/3127479.3132565","url":null,"abstract":"Big Data processing often suffers from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors, harming system performance and reliability. Therefore, users tend to give an excessive heap size to applications to avoid job failure, causing low cluster utilization. In this paper, we demonstrate that lightweight virtualization, such as OS containers, opens up opportunities to address memory pressure: 1) tasks running in a container can be set to a large heap size to avoid OOM errors without worrying about thrashing the host machine; 2) tasks that are under memory pressure and incur significant GC activities can be temporarily \"suspended\" by depriving the hosting container's resources, and can be \"resumed\" later when other tasks complete and release their resources. We propose and develop mBalloon, an elastic memory manager, that leverages containers to flexibly and precisely control the memory usage of big data tasks. Applications running with mBalloon can survive from memory pressure, incur less GC overhead and help improve cluster utilization.","PeriodicalId":20679,"journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81731890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A policy-based system for dynamic scaling of virtual machine memory reservations","authors":"R. Smith, S. Rixner","doi":"10.1145/3127479.3127491","DOIUrl":"https://doi.org/10.1145/3127479.3127491","url":null,"abstract":"To maximize the effectiveness of modern virtualization systems, resources must be allocated fairly and efficiently amongst virtual machines (VMs). However, current policies for allocating memory are relatively static. As a result, system-wide memory utilization is often sub-optimal, leading to unnecessary paging and performance degradation. To better utilize the large-scale memory resources of modern machines, the virtualization system must allow virtual machines to expand beyond their initial memory reservations, while still fairly supporting concurrent virtual machines. This paper presents a system for dynamically allocating memory amongst virtual machines at runtime, as well as an evaluation of six allocation policies implemented within the system. The system allows guest VMs to expand and contract according to their changing demands by uniquely improving and integrating mechanisms such as memory ballooning, memory hotplug, and hypervisor paging. Furthermore, the system provides fairness by guaranteeing each guest a minimum reservation, charging for rentals beyond this minimum, and enforcing timely reclamation of memory.","PeriodicalId":20679,"journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77495481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WorkloadCompactor: reducing datacenter cost while providing tail latency SLO guarantees","authors":"T. Zhu, M. Kozuch, Mor Harchol-Balter","doi":"10.1145/3127479.3132245","DOIUrl":"https://doi.org/10.1145/3127479.3132245","url":null,"abstract":"Service providers want to reduce datacenter costs by consolidating workloads onto fewer servers. At the same time, customers have performance goals, such as meeting tail latency Service Level Objectives (SLOs). Consolidating workloads while meeting tail latency goals is challenging, especially since workloads in production environments are often bursty. To limit the congestion when consolidating workloads, customers and service providers often agree upon rate limits. Ideally, rate limits are chosen to maximize the number of workloads that can be co-located while meeting each workload's SLO. In reality, neither the service provider nor customer knows how to choose rate limits. Customers end up selecting rate limits on their own in some ad hoc fashion, and service providers are left to optimize given the chosen rate limits. This paper describes WorkloadCompactor, a new system that uses workload traces to automatically choose rate limits simultaneously with selecting onto which server to place workloads. Our system meets customer tail latency SLOs while minimizing datacenter resource costs. Our experiments show that by optimizing the choice of rate limits, WorkloadCompactor reduces the number of required servers by 30--60% as compared to state-of-the-art approaches.","PeriodicalId":20679,"journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77511418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KVS: high-efficiency kernel-level virtual switch
Heungsik Choi, Gyeongsik Yang, Kyungwoon Lee, C. Yoo. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17), September 2017. DOI: https://doi.org/10.1145/3127479.3131615

Abstract: In clouds, the virtual switch (vSwitch) is in charge of packet forwarding between virtual machines (VMs). However, kernel-based vSwitches show throughput degradation under intensive packet processing; this becomes a bottleneck for the network performance of clouds. DPDK-based vSwitch (DPDK vSwitch) [1] has been developed to resolve this performance problem. Although it exhibits high throughput, DPDK vSwitch has two weak points. First, it consumes excessive memory: it uses huge pages to reduce the number of memory operations, and this design causes high memory consumption even when traffic is low. According to [2], memory determines the number of VMs that fit on a single physical server, so saving memory decreases the capital expenditure of clouds. Second, security is a further concern, because the DPDK vSwitch data plane is exposed to user space through shared memory [3], so the isolation of packets across VMs cannot be guaranteed. To overcome the excessive memory use and the security concern, we propose a new kernel-level vSwitch (KVS) based on Linux. KVS uses neither huge pages nor kernel bypass; instead, it applies several key ideas to enhance throughput.
DAIET: a system for data aggregation inside the network
Amedeo Sapio, I. Abdelaziz, Marco Canini, Panos Kalnis. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17), September 2017. DOI: https://doi.org/10.1145/3127479.3132018

Abstract: Many data center applications nowadays rely on distributed computation models like MapReduce and Bulk Synchronous Parallel (BSP) for data-intensive computation at scale [4]. These models scale by leveraging the partition/aggregate pattern, where data and computation are distributed across many worker servers, each performing part of the computation. A communication phase is needed each time workers need to synchronize the computation and, at the end, to produce the final output. In these applications, network communication costs can be one of the dominant scalability bottlenecks, especially for multi-stage or iterative computations [1]. The advent of flexible networking hardware and expressive data-plane programming languages has produced networks that are deeply programmable [2]. This creates the opportunity to co-design distributed systems with their network layer, which can offer substantial performance benefits. One possible use of this emerging technology is to execute logic traditionally associated with the application layer inside the network itself. Given that in these applications the intermediate results are necessarily exchanged through the network, it is desirable to offload part of the aggregation task to the network, reducing traffic and lessening the work of the servers. However, programmable networking devices typically have very stringent constraints on the number and type of operations that can be performed at line rate. Moreover, packet processing at high speed requires very fast memory, such as TCAM or SRAM, which is expensive and usually available only in small capacities.
{"title":"HotSpot: automated server hopping in cloud spot markets","authors":"Supreeth Shastri, David E. Irwin","doi":"10.1145/3127479.3132017","DOIUrl":"https://doi.org/10.1145/3127479.3132017","url":null,"abstract":"Cloud spot markets offer virtual machines (VMs) for a dynamic price that is much lower than the fixed price of on-demand VMs. In exchange, spot VMs expose applications to multiple forms of risk, including price risk, or the risk that a VM's price will increase relative to others. Since spot prices vary continuously across hundreds of different types of VMs, flexible applications can mitigate price risk by moving to the VM that currently offers the lowest cost. To enable this flexibility, we present HotSpot, a resource container that \"hops\" VMs---by dynamically selecting and self-migrating to new VMs---as spot prices change. HotSpot containers define a migration policy that lowers cost by determining when to hop VMs based on the transaction costs (from vacating a VM early and briefly double paying for it) and benefits (the expected cost savings). As a side effect of migrating to minimize cost, HotSpot is also able to reduce the number of revocations without degrading performance. HotSpot is simple and transparent: since it operates at the systems-level on each host VM, users need only run an HotSpot-enabled VM image to use it. We implement a HotSpot prototype on EC2, and evaluate it using job traces from a production Google cluster. We then compare HotSpot to using on-demand VMs and spot VMs (with and without fault-tolerance) in EC2, and show that it is able to lower cost and reduce the number of revocations without degrading performance.","PeriodicalId":20679,"journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90299872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Search lookaside buffer: efficient caching for index data structures","authors":"Xingbo Wu, Fan Ni, Song Jiang","doi":"10.1145/3127479.3127483","DOIUrl":"https://doi.org/10.1145/3127479.3127483","url":null,"abstract":"With the ever increasing DRAM capacity in commodity computers, applications tend to store large amount of data in main memory for fast access. Accordingly, efficient traversal of index structures to locate requested data becomes crucial to their performance. The index data structures grow so large that only a fraction of them can be cached in the CPU cache. The CPU cache can leverage access locality to keep the most frequently used part of an index in it for fast access. However, the traversal on the index to a target data during a search for a data item can result in significant false temporal and spatial localities, which make CPU cache space substantially underutilized. In this paper we show that even for highly skewed accesses the index traversal incurs excessive cache misses leading to suboptimal data access performance. To address the issue, we introduce Search Lookaside Buffer (SLB) to selectively cache only the search results, instead of the index itself. SLB can be easily integrated with any index data structure to increase utilization of the limited CPU cache resource and improve throughput of search requests on a large data set. We integrate SLB with various index data structures and applications. Experiments show that SLB can improve throughput of the index data structures by up to an order of magnitude. Experiments with real-world key-value traces also show up to 73% throughput improvement on a hash table.","PeriodicalId":20679,"journal":{"name":"Proceedings of the 2017 Symposium on Cloud Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90361291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AidOps: a data-driven provisioning of high-availability services in cloud
D. Lugones, Jordi Arjona Aroca, Yue Jin, A. Sala, V. Hilt. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC '17), September 2017. DOI: https://doi.org/10.1145/3127479.3129250

Abstract: The virtualization of services with high-availability requirements calls for revisiting traditional operation and provisioning processes. Providers are realizing services in software on virtual machines instead of using dedicated appliances, so as to dynamically adjust service capacity to changing demands. Cloud orchestration systems control the number of service instances deployed to make sure each service has enough capacity for incoming workloads. However, determining the suitable build-out of a service is challenging: it takes time to install new instances, and excessive re-configuration (i.e., scaling in/out) can decrease stability. In this paper we present AidOps, a cloud orchestration system that leverages machine learning and domain-specific knowledge to predict traffic demand, optimizing service performance and cost. AidOps does not require conservative provisioning of services to cover the worst-case demand, and it significantly reduces operational costs while still fulfilling service quality expectations. We have evaluated our framework with real traffic using an enterprise application and a communication service in a private cloud. Our results show up to 4x improvement in service performance indicators compared to existing orchestration systems. AidOps achieves up to 99.985% availability while reducing operational costs by at least 20%.