Proceedings of the 2017 Symposium on Cloud Computing: Latest Articles

Reducing tail latencies in micro-batch streaming workloads

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3134433

Faria Kalim, A. Tantawi, S. Costache, A. Youssef

Citations: 0

AKC: advanced KSM for cloud computing

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3131616

Sioh Lee, Bongkyu Kim, Youngpil Kim, C. Yoo

Abstract: Kernel samepage merging (KSM) in the Linux kernel is a memory deduplication scheme that finds duplicate pages and shares them in order to alleviate memory bottlenecks in the cloud. However, because KSM has to scan all pages in memory to find duplicates, it consumes many CPU cycles and degrades virtual machine (VM) performance [1]. This degradation is an obstacle to serving real-time applications (e.g., Netflix) in the cloud [3]. A previous work, CMD [1], proposed a page grouping scheme to reduce page comparisons, but it requires special monitoring hardware. XLH [2] enhanced page sharing with information about guest VM I/O operations; however, the CPU overhead of XLH is still very high, similar to that of the default KSM. To make KSM more useful, we need an optimization scheme that consumes fewer CPU cycles. Therefore, we first profile the CPU cycle consumption of KSM, and the results show that page comparison (28.77%) and page checksum (26.14%) take most of the cycles. Based on these results, we propose advanced KSM for cloud computing (AKC), which consumes fewer CPU cycles than the default KSM. To reduce the number of page comparisons, we apply a checksum-based RB-tree structure. In addition, AKC decreases page checksum overhead with a hardware-accelerated crc32 hash function.
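The idea of indexing pages by checksum so that full page comparisons happen only between pages whose checksums match can be illustrated with a small user-space sketch. This is not the AKC implementation: it assumes a plain Python dictionary in place of the kernel RB-tree, uses the software `zlib.crc32` rather than a hardware-accelerated CRC32 instruction, and the function name `dedup_pages` is hypothetical.

```python
import zlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size for the example


def dedup_pages(pages):
    """Group identical pages, comparing bytes only when checksums match.

    Returns (mapping, comparisons): mapping[i] is the index of the first
    page with the same content as page i, and comparisons counts the
    full byte-for-byte page comparisons that were performed.
    """
    # Assumption: a dict keyed by checksum stands in for the RB-tree.
    by_checksum = defaultdict(list)  # checksum -> indices of canonical pages
    mapping = []
    comparisons = 0
    for i, page in enumerate(pages):
        checksum = zlib.crc32(page)  # software CRC32, not hardware-accelerated
        canonical = None
        for j in by_checksum[checksum]:
            comparisons += 1
            if pages[j] == page:  # full comparison guards against collisions
                canonical = j
                break
        if canonical is None:
            by_checksum[checksum].append(i)
            canonical = i
        mapping.append(canonical)
    return mapping, comparisons


pages = [
    b"\x00" * PAGE_SIZE,
    b"\x01" * PAGE_SIZE,
    b"\x00" * PAGE_SIZE,
    b"\x02" * PAGE_SIZE,
    b"\x01" * PAGE_SIZE,
]
mapping, comparisons = dedup_pages(pages)
print(mapping, comparisons)
```

In this toy input only the two duplicate pages trigger a full comparison; pages with a unique checksum are never compared byte for byte.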

Citations: 1

Analysis of TPC-DS: the first standard benchmark for SQL-based big data systems

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3128603

Meikel Pöss, T. Rabl, H. Jacobsen

Abstract: The advent of Web 2.0 companies, such as Facebook, Google, and Amazon, with their insatiable appetite for vast amounts of structured, semi-structured, and unstructured data, triggered the development of Hadoop and related tools, e.g., YARN, MapReduce, and Pig, as well as NoSQL databases. These tools form an open source software stack to support the processing of large and diverse data sets on clustered systems to perform decision support tasks. Recently, SQL has been making a comeback in many of these solutions, e.g., Hive, Stinger, Impala, Shark, and Presto. At the same time, RDBMS vendors are adding Hadoop support to their SQL engines, e.g., IBM's Big SQL, Actian's Vortex, Oracle's Big Data SQL, and SAP's HANA. Because there was no industry standard benchmark that could measure the performance of SQL-based big data solutions, marketing claims were mostly based on "cherry picked" subsets of the TPC-DS benchmark chosen to suit individual companies' strengths while hiding their weaknesses. In this paper, we present and analyze our work on modifying TPC-DS to fill the void for an industry standard benchmark that is able to measure the performance of SQL-based big data solutions. The new benchmark was ratified by the TPC in early 2016. To show the significance of the new benchmark, we analyze performance data obtained on four different systems running big data, traditional RDBMS, and columnar in-memory architectures.

Citations: 36

STYX: a trusted and accelerated hierarchical SSL key management and distribution system for cloud based CDN application

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3127482

Changzheng Wei, Jian Li, Weigang Li, Ping Yu, Haibing Guan

Abstract: Protecting the customer's SSL private key is the paramount issue in persuading website owners to migrate their content onto cloud infrastructure, notwithstanding the advantages of cloud infrastructure in terms of flexibility, efficiency, scalability, and elasticity. The emerging Keyless SSL solution retains on-premise custody of customers' SSL private keys on their own servers. However, it suffers from significant performance degradation and limited scalability, caused by the long-distance connection to the Key Server for each new end-user request. Performance improvements that rely on persistent sessions and key caching in the cloud weaken key protection and discourage website owners because of the cloud's security bugs. In this paper, the challenges of secure key protection and distribution are addressed under the philosophy of "Storing the trusted DATA on untrusted platform and transmitting through untrusted channel". To this end, a three-phase hierarchical key management scheme called STYX is proposed to provide secure key protection together with hardware-assisted service acceleration for cloud-based content delivery network (CCDN) applications. STYX is implemented using Intel Software Guard Extensions (SGX), Intel QuickAssist Technology (QAT), and the SIGMA (SIGn-and-MAc) protocol. STYX provides a tight key security guarantee through SGX-based key distribution with light overhead, and it further significantly enhances system performance with QAT-based acceleration. Comprehensive evaluations show that STYX not only guarantees absolute security but also outperforms a CDN deployed as a direct HTTPS server without QAT by up to 5x in throughput, with a significant latency reduction at the same time.

Citations: 20

Polygravity: traffic usage accountability via coarse-grained measurements in multi-tenant data centers

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3129258

H. Baek, Cheng Jin, Guofei Jiang, C. Lumezanu, J. Merwe, Ning Xia, Qiang Xu

Abstract: Network usage accountability is critical in helping operators and customers of multi-tenant data centers deal with concerns such as capacity planning, resource allocation, hotspot detection, link failure detection, and troubleshooting. However, the cost of the measurements and instrumentation needed to achieve flow-level accountability is non-trivial. We propose Polygravity to determine tenant traffic usage via lightweight measurements in multi-tenant data centers. We adopt the tomogravity model widely used in ISP networks and adapt it to a multi-tenant data center environment. By integrating datacenter-specific domain knowledge, sampling-based partial estimation, and gravity-based estimation of internal sinks/sources, Polygravity addresses two key challenges in adapting tomogravity to a data center environment: sparse traffic matrices and internal traffic sinks/sources. We conducted an extensive evaluation of our approach using realistic data center workloads. Our results show that Polygravity can determine tenant IP flow usage with less than 1% average relative error for tenants with fine-grained domain knowledge. In addition, for tenants with coarse-grained domain knowledge and with partial host-based sampling, Polygravity reduces the relative error of sampling-based estimation by one third.
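The gravity step that tomogravity-style estimators start from can be sketched in a few lines: traffic from a source to a sink is assumed proportional to the product of the source's total outgoing volume and the sink's total incoming volume. The sketch below shows only this generic gravity prior, not Polygravity's adapted model; the function name `gravity_estimate` and the sample endpoint names are hypothetical, and it assumes total outgoing and incoming volumes are equal.

```python
def gravity_estimate(out_bytes, in_bytes):
    """Estimate a traffic matrix from per-endpoint totals (gravity model).

    out_bytes maps each source to its total outgoing volume and in_bytes
    maps each sink to its total incoming volume. The estimate for a
    (source, sink) pair is out[source] * in[sink] / total.
    """
    total = sum(in_bytes.values())
    if total == 0:
        return {}
    return {
        (src, dst): out_bytes[src] * in_bytes[dst] / total
        for src in out_bytes
        for dst in in_bytes
    }


# Hypothetical coarse-grained counters for two sources and two sinks.
est = gravity_estimate({"a": 60, "b": 40}, {"x": 50, "y": 50})
print(est[("a", "x")], est[("b", "y")])
```

Each row of the estimate sums to the source's measured outgoing volume, which is why only coarse per-endpoint counters are needed as input.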

Citations: 2

SEaMLESS: a SErvice migration cLoud architecture for energy saving and memory releaSing capabilities

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3128604

Dino Lopez Pacheco, Quentin Jacquemart, A. Segalini, M. Rifai, M. Dione, G. Urvoy-Keller

Citations: 1

CapNet: security and least authority in a capability-enabled cloud

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3131209

A. Burtsev, David Johnson, Josh Kunz, E. Eide, J. Merwe

Citations: 7

To edge or not to edge?

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3132572

Faria Kalim, S. Noghabi, Shiv Verma

Citations: 7

On-demand virtualization for live migration in bare metal cloud

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3129254

Jae-Hwa Im, Jongyul Kim, Jonguk Kim, Seongwook Jin, S. Maeng

Abstract: Demand for bare-metal cloud services has increased rapidly because such services are cost-effective for several types of workloads, and some cloud clients prefer a single-tenant environment because such environments are less exposed to security vulnerabilities. However, as the bare-metal cloud does not use a virtualization layer, it cannot use live migration, which limits its manageability. Live migration support can improve the manageability of bare-metal cloud services significantly. This paper proposes an on-demand virtualization technique to improve the manageability of bare-metal cloud services. A thin virtualization layer is inserted into the bare-metal cloud when live migration is requested. After the live migration process completes, the thin virtualization layer is removed from the host. We modified BitVisor [19] to implement on-demand virtualization and live migration on the x86 architecture. The elapsed time of on-demand virtualization was negligible: it takes about 20 ms to insert the virtualization layer and 30 ms to remove it. After the virtualization layer is removed, the host machine runs with bare-metal performance.

Citations: 6

Stocator: an object store aware connector for apache spark

Proceedings of the 2017 Symposium on Cloud Computing Pub Date : 2017-09-24 DOI: 10.1145/3127479.3134761

G. Vernik, M. Factor, E. K. Kolodner, Effi Ofer, P. Michiardi, Francesco Pace

Abstract: Data is the natural resource of the 21st century. It is being produced at dizzying rates, e.g., for genomics, for media and entertainment, and for the Internet of Things. Object storage systems such as Amazon S3, Azure Blob storage, and IBM Cloud Object Storage are highly scalable distributed storage systems that offer high-capacity, cost-effective storage. But it is not enough just to store data; we also need to derive value from it. Apache Spark is the leading big data analytics processing engine, combining MapReduce, SQL, streaming, and complex analytics. We present Stocator, a high-performance storage connector that enables Spark to work directly on data stored in object storage systems while providing the same correctness guarantees as Hadoop's original storage system, HDFS. Current object storage connectors from the Hadoop community, e.g., for the S3 and Swift APIs, do not deal well with eventual consistency, which can lead to failure. These connectors assume file system semantics, which is natural given that their model of operation is based on interaction with HDFS. In particular, Spark and Hadoop achieve fault tolerance and enable speculative execution by creating temporary files, listing directories to identify these files, and then renaming them. This paradigm avoids interference between tasks doing the same work and thus writing output with the same name. However, with eventually consistent object storage, a container listing may not yet include a recently created object, and thus an object may not be renamed, leading to incomplete or incorrect results. Solutions such as EMRFS [1] from Amazon, S3mper [4] from Netflix, and S3Guard [2] attempt to overcome eventual consistency by requiring additional strongly consistent data storage. These solutions require multiple storage systems, are costly, and can introduce issues of consistency between the stores. Current object storage connectors from the Hadoop community are also notorious for their poor performance on write workloads. This, too, stems from their use of the rename operation, which is not a native object storage operation; not only is it not atomic, but it must be implemented using a costly copy operation followed by a delete. Others have tried to improve the performance of object storage connectors by eliminating rename, e.g., the Direct-ParquetOutputCommitter [5] for S3a introduced by Databricks, but have failed to preserve fault tolerance and speculation. Stocator takes advantage of object storage semantics to achieve both high performance and fault tolerance. It eliminates the rename paradigm by writing each output object to its final name. The name includes both the part number and the attempt number, so that multiple attempts to write the same part use different objects. Stocator proposes to extend an already existing success indicator object, written at the end of a Spark job, to include a manifest with the names of all the objects that compose the final output; this ensures that…
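The rename-free write path described in this abstract can be sketched with an in-memory dictionary standing in for an object store. This is an illustration under stated assumptions, not Stocator's code: the object name format, the JSON layout of the manifest, and the helper names `object_name`, `write_part`, `commit_job`, and `read_dataset` are all hypothetical. Only the two ideas come from the abstract: each attempt writes directly to a final name containing the part and attempt numbers, and the success object carries a manifest of the objects that make up the output.

```python
import json


def object_name(dataset, part, attempt):
    """Build a final object name embedding part and attempt numbers."""
    # Hypothetical format; the real connector's naming scheme may differ.
    return f"{dataset}/part-{part:05d}-attempt-{attempt}"


def write_part(store, dataset, part, attempt, data):
    """Write one task attempt directly to its final name (no temp file, no rename)."""
    name = object_name(dataset, part, attempt)
    store[name] = data
    return name


def commit_job(store, dataset, winning_names):
    """Write the success object with a manifest of the objects that count."""
    manifest = sorted(winning_names)
    store[f"{dataset}/_SUCCESS"] = json.dumps({"manifest": manifest}).encode()


def read_dataset(store, dataset):
    """Read only the objects named in the manifest, avoiding container listings."""
    manifest = json.loads(store[f"{dataset}/_SUCCESS"])["manifest"]
    return [store[name] for name in manifest]


store = {}  # stands in for an object store container
first = write_part(store, "out", 0, 0, b"p0")
stale = write_part(store, "out", 1, 0, b"stale")  # e.g. a failed or speculative attempt
retry = write_part(store, "out", 1, 1, b"p1")
commit_job(store, "out", [first, retry])
print(read_dataset(store, "out"))
```

Because readers follow the manifest rather than a listing, the leftover object from the losing attempt is simply ignored and never needs to be renamed.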

Citations: 2