Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... SoCC (Conference): Latest Publications

Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications
Francisco Romero, G. Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, N. Yadwadkar, R. Fonseca, C. Kozyrakis, R. Bianchini
DOI: 10.1145/3472883.3486974 · Published: 2021-04-28
Abstract: Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy their applications without the burden of managing the underlying infrastructure. However, existing FaaS platforms rely on remote storage to maintain state, limiting the set of applications that can be run efficiently. Recent caching work for FaaS platforms has tried to address this problem, but has fallen short: it disregards the widely different characteristics of FaaS applications, does not scale the cache based on data access patterns, or requires changes to applications. To address these limitations, we present Faa$T, a transparent auto-scaling distributed cache for serverless applications. Each application gets its own cache. After a function executes and the application becomes inactive, the cache is unloaded from memory with the application. Upon reloading for the next invocation, Faa$T pre-warms the cache with objects likely to be accessed. In addition to traditional compute-based scaling, Faa$T scales based on working set and object sizes to manage cache space and I/O bandwidth. We motivate our design with a comprehensive study of data access patterns on Azure Functions. We implement Faa$T for Azure Functions, and show that Faa$T can improve performance by up to 92% (57% on average) for challenging applications, and reduce cost for most users compared to state-of-the-art caching systems, avoiding the cost of having to stand up additional serverful resources.
Cited by: 56
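The abstract's core mechanism — a per-application cache that unloads with the idle application and pre-warms likely-hot objects on reload — can be sketched in a few lines. This is a toy illustration only: the class and method names, and the pre-warm policy (top-k keys by historical access count), are assumptions for the sketch, not Faa$T's actual design.

```python
from collections import Counter

class AppCache:
    """Toy per-application cache: unloads with the application, keeps
    access history, and pre-warms historically hot objects on reload."""

    def __init__(self, backing_store, prewarm_k=2):
        self.store = backing_store      # stand-in for remote storage
        self.cache = {}                 # in-memory cache, dies on unload
        self.access_counts = Counter()  # history kept across unloads
        self.prewarm_k = prewarm_k

    def get(self, key):
        self.access_counts[key] += 1
        if key not in self.cache:       # miss: fetch from remote storage
            self.cache[key] = self.store[key]
        return self.cache[key]

    def unload(self):
        """Application went idle: drop cached data, keep access history."""
        self.cache.clear()

    def reload(self):
        """Application invoked again: pre-warm the k hottest keys."""
        for key, _ in self.access_counts.most_common(self.prewarm_k):
            self.cache[key] = self.store[key]

store = {"model": b"weights", "config": b"cfg", "tmp": b"x"}
c = AppCache(store)
c.get("model"); c.get("model"); c.get("config")
c.unload()                              # idle period: cache is gone
c.reload()                              # "model" and "config" are warm again
```

The real system additionally scales cache capacity with working-set and object sizes; this sketch only shows the unload/pre-warm lifecycle.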
Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines
Francisco Romero, Mark Zhao, N. Yadwadkar, C. Kozyrakis
DOI: 10.1145/3472883.3486972 · Published: 2021-02-03
Abstract: The proliferation of camera-enabled devices and large video repositories has led to a diverse set of video analytics applications. These applications rely on video pipelines, represented as DAGs of operations, to transform videos, process extracted metadata, and answer questions like, "Is this intersection congested?" The latency and resource efficiency of pipelines can be optimized using configurable knobs for each operation (e.g., sampling rate, batch size, or type of hardware used). However, determining efficient configurations is challenging because (a) the configuration search space is exponentially large, (b) the optimal configuration depends on users' desired latency and cost targets, and (c) input video contents may exercise different paths in the DAG and produce a variable amount of intermediate results. Existing video analytics and processing systems leave it to the users to manually configure operations and select hardware resources. We present Llama: a heterogeneous and serverless framework for auto-tuning video pipelines. Given an end-to-end latency target, Llama optimizes for cost efficiency by (a) calculating a latency target for each operation invocation, and (b) dynamically running a cost-based optimizer to assign configurations across heterogeneous hardware that best meet the calculated per-invocation latency target. This makes the problem of auto-tuning large video pipelines tractable and allows us to handle input-dependent behavior, conditional branches in the DAG, and execution variability. We describe the algorithms in Llama and evaluate it on a cloud platform using serverless CPU and GPU resources. We show that compared to state-of-the-art cluster and serverless video analytics and processing systems, Llama achieves 7.8x lower latency and 16x cost reduction on average.
Cited by: 50
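The two steps the abstract names — (a) splitting an end-to-end latency target into per-operation targets, and (b) picking the cheapest configuration that meets each slice — can be sketched as below. The proportional split and the example configuration table are illustrative assumptions, not Llama's actual algorithm.

```python
def split_latency_target(end_to_end_target, profiled_latencies):
    """Give each operation a slice of the latency budget proportional
    to its profiled mean latency (a simple, assumed split policy)."""
    total = sum(profiled_latencies.values())
    return {op: end_to_end_target * lat / total
            for op, lat in profiled_latencies.items()}

def pick_config(target, configs):
    """configs: list of (name, expected_latency, cost_per_invocation).
    Return the cheapest configuration meeting the per-invocation target,
    falling back to the fastest one if none is feasible."""
    feasible = [c for c in configs if c[1] <= target]
    if not feasible:
        return min(configs, key=lambda c: c[1])
    return min(feasible, key=lambda c: c[2])

# Hypothetical three-stage pipeline with profiled mean latencies in ms.
profiled = {"decode": 40.0, "detect": 120.0, "aggregate": 40.0}
targets = split_latency_target(400.0, profiled)   # 400 ms end-to-end budget

# Hypothetical hardware choices for the "detect" stage.
detect_configs = [("cpu", 300.0, 1.0), ("gpu", 80.0, 4.0), ("gpu-batch", 150.0, 2.0)]
choice = pick_config(targets["detect"], detect_configs)
```

With the numbers above, "detect" receives a 240 ms slice, so the optimizer passes over the fast-but-expensive GPU config in favor of the cheaper batched one that still fits the slice.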
SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021
DOI: 10.1145/3472883 · Published: 2021-01-01
Cited by: 0
SoCC '20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 19-21, 2020
DOI: 10.1145/3419111 · Published: 2020-01-01
Cited by: 1
Grasper
Hongzhi Chen, Changji Li, Juncheng Fang, Chenghuan Huang, James Cheng, Jian Zhang, Yifan Hou, Xiao Yan
DOI: 10.1145/3357223.3362715 · Published: 2019-11-20
Abstract: The property graph (PG) model is one of the most general graph data models and has been widely adopted in many graph analytics and processing systems. However, existing systems suffer from poor performance in terms of both latency and throughput for processing online analytical workloads on PGs due to their design defects, such as expensive interactions with external databases, low parallelism, and high network overheads. In this paper, we propose Grasper, a high-performance distributed system for OLAP on property graphs. Grasper adopts RDMA-aware system designs to reduce the network communication cost. We propose a novel query execution model, called Expert Model, which supports adaptive parallelism control at the fine-grained query operation level and allows tailored optimizations for different categories of query operators, thus achieving high parallelism and good load balancing. Experimental results show that Grasper achieves low latency and high throughput on a broad range of online analytical workloads.
Cited by: 2
Proceedings of the ACM Symposium on Cloud Computing
DOI: 10.1145/3357223 · Published: 2019-11-20
Cited by: 3
Pufferfish: Container-driven Elastic Memory Management for Data-intensive Applications
Wei Chen, Aidi Pi, Shaoqi Wang, Xiaobo Zhou
DOI: 10.1145/3357223.3362730 · Published: 2019-11-20
Abstract: Data-intensive applications often suffer from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors, harming system performance and reliability. In this paper, we demonstrate how lightweight virtualization via OS containers opens up opportunities to address memory pressure and realize memory elasticity: 1) tasks running in a container can be set to a large heap size to avoid OOM errors, and 2) tasks that are under memory pressure and incur significant swapping activities can be temporarily "suspended" by depriving resources from the hosting containers, and be "resumed" when resources are available. We propose and develop Pufferfish, an elastic memory manager that leverages containers to flexibly allocate memory for tasks. Memory elasticity achieved by Pufferfish can be exploited by a cluster scheduler to improve cluster utilization and task parallelism. We implement Pufferfish on the cluster scheduler Apache YARN. Experiments with Spark and MapReduce on real-world traces show Pufferfish is able to avoid OOM errors, improve cluster memory utilization by 2.7x, and reduce the median job runtime by 5.5x compared to a memory over-provisioning solution.
Cited by: 10
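The suspend/resume mechanism the abstract describes — depriving a hosting container of resources and restoring them later — maps naturally onto Linux cgroups. A minimal sketch, assuming a cgroup v1 freezer hierarchy mounted at `/sys/fs/cgroup/freezer` (the paths and function names are illustrative; this is the OS mechanism, not Pufferfish's implementation):

```python
import os

# Assumed cgroup v1 freezer mount point; adjust for your system.
FREEZER_ROOT = "/sys/fs/cgroup/freezer"

def set_freezer_state(container, state, root=FREEZER_ROOT):
    """Write the freezer state for a container's cgroup.
    "FROZEN" stops every task in the cgroup; "THAWED" resumes them."""
    path = os.path.join(root, container, "freezer.state")
    with open(path, "w") as f:
        f.write(state)

def suspend(container, root=FREEZER_ROOT):
    """Suspend a memory-thrashing task by freezing its container."""
    set_freezer_state(container, "FROZEN", root)

def resume(container, root=FREEZER_ROOT):
    """Resume the container once memory is available again."""
    set_freezer_state(container, "THAWED", root)
```

A scheduler built on this primitive can freeze containers whose tasks are swapping heavily and thaw them when cluster memory frees up, which is the elasticity the paper exploits.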
Software Data Planes: You Can't Always Spin to Win
Hossein Golestani, Amirhossein Mirhosseini, T. Wenisch
DOI: 10.1145/3357223.3362737 · Published: 2019-11-20
Abstract: Today's datacenters demand high-performance, energy-efficient software data planes, which are widely used in many areas including fast network packet processing, network function virtualization, high-speed data transfer in storage systems, and I/O virtualization. Modern software data planes bypass OS I/O stacks and rely on cores spinning on user-level queues as a fast notification mechanism. Whereas spin-polling can improve latency and throughput, it entails significant shortcomings, especially when scaling to large numbers of cores/queues. In this paper, we pinpoint and quantify challenges of spin-polling-based software data planes using Intel's Data Plane Development Kit (DPDK) as a representative infrastructure. We characterize four scalability issues of software data planes: (1) full-tilt spinning cores perform more (useless) polling work when there is less work pending in the queues; (2) spin-polling scales poorly with the number of polled queues due to processor cache capacity constraints, especially when traffic is unbalanced; (3) operation rate limits (transactions per second) as well as a "polling tax" (the overhead of polling, which is considerable even when operating at saturation throughput) result in poor core scalability; and (4) whereas shared queues can mitigate load imbalance and head-of-line blocking, synchronization overheads limit their potential benefits. We identify root causes of these issues and discuss solution directions to improve hardware and software abstractions for better performance, efficiency, and scalability in software data planes.
Cited by: 16
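Issue (1) — spinning cores doing more useless polls precisely when queues are emptier — is easy to reproduce in a toy model. The sketch below round-robins a fixed poll budget over a set of user-level queues and counts how many polls found work; the "polling tax" is the wasted fraction. This is purely illustrative of the mechanism the paper critiques (DPDK's real receive loop is a C `rte_eth_rx_burst` loop):

```python
from collections import deque

def spin_poll(queues, budget):
    """Spin over queues round-robin for `budget` poll attempts.
    Returns (useful, wasted): polls that found an item vs. empty polls."""
    useful = wasted = 0
    for i in range(budget):
        q = queues[i % len(queues)]     # round-robin over polled queues
        if q:
            q.popleft()                 # "process" one item
            useful += 1
        else:
            wasted += 1                 # poll hit an empty queue: pure tax

    return useful, wasted

# One busy queue among four polled queues: traffic imbalance means
# most poll attempts land on empty queues and are wasted work.
queues = [deque(range(25)), deque(), deque(), deque()]
useful, wasted = spin_poll(queues, budget=100)   # useful=25, wasted=75
```

Scaling the number of polled queues without scaling traffic only grows the wasted share, which is the core-scalability problem the paper quantifies.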
An automated, cross-layer instrumentation framework for diagnosing performance problems in distributed applications
E. Ates, Lily Sturmann, Mert Toslali, O. Krieger, Richard Megginson, A. Coskun, Raja R. Sambasivan
DOI: 10.1145/3357223.3362704 · Published: 2019-11-20
Abstract: Diagnosing performance problems in distributed applications is extremely challenging. A significant reason is that it is hard to know where to place instrumentation a priori to help diagnose problems that may occur in the future. We present the vision of an automated instrumentation framework, Pythia, that runs alongside deployed distributed applications. In response to a newly-observed performance problem, Pythia searches the space of possible instrumentation choices to enable the instrumentation needed to help diagnose it. Our vision for Pythia builds on workflow-centric tracing, which records the order and timing of how requests are processed within and among a distributed application's nodes (i.e., records their workflows). It uses the key insight that localizing the sources of high performance variation within the workflows of requests that are expected to perform similarly gives insight into where additional instrumentation is needed.
Cited by: 20
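The key insight — localize the span whose latency varies most across requests that should perform similarly, and instrument there — can be rendered as a few lines. The trace shape and span names are hypothetical, and variance-maximization is only a toy stand-in for Pythia's actual search over instrumentation choices:

```python
from statistics import pvariance

def localize_variance(traces):
    """traces: list of dicts mapping span name -> latency (ms), for a
    group of requests expected to perform similarly. Return the span
    with the highest latency variance: the most promising place to
    enable additional instrumentation."""
    spans = traces[0].keys()
    return max(spans, key=lambda s: pvariance([t[s] for t in traces]))

# Three "similar" requests; the db span's latency is wildly unstable.
traces = [
    {"auth": 2.0, "db": 10.0, "render": 5.0},
    {"auth": 2.1, "db": 95.0, "render": 5.2},
    {"auth": 1.9, "db": 12.0, "render": 4.8},
]
hotspot = localize_variance(traces)   # "db" varies most across requests
```

In the envisioned system this localization step feeds back into which tracepoints get enabled, closing the loop between observed variance and instrumentation coverage.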
Cirrus
João Carreira, P. Fonseca, A. Tumanov, Andrew Zhang, R. Katz
DOI: 10.1145/3357223.3362711 · Published: 2019-11-20
Abstract: Machine learning (ML) workflows are extremely complex. The typical workflow consists of distinct stages of user interaction, such as preprocessing, training, and tuning, that are repeatedly executed by users but have heterogeneous computational requirements. This complexity makes it challenging for ML users to correctly provision and manage resources and, in practice, constitutes a significant burden that frequently causes over-provisioning and impairs user productivity. Serverless computing is a compelling model to address the resource management problem, in general, but there are numerous challenges to adopt it for existing ML frameworks due to significant restrictions on local resources. This work proposes Cirrus, an ML framework that automates the end-to-end management of datacenter resources for ML workflows by efficiently taking advantage of serverless infrastructures. Cirrus combines the simplicity of the serverless interface and the scalability of the serverless infrastructure (AWS Lambdas and S3) to minimize user effort. We show a design specialized for both serverless computation and iterative ML training is needed for robust and efficient ML training on serverless infrastructure. Our evaluation shows that Cirrus outperforms frameworks specialized along a single dimension: Cirrus is 100x faster than a general purpose serverless system [36] and 3.75x faster than specialized ML frameworks for traditional infrastructures [49].
Cited by: 35