Stateful Serverless Computing with Crucial

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI:10.1145/3490386

Daniel Barcelona Pons, P. Sutra, Marc Sánchez Artigas, Gerard París, P. López


Citations: 16

Abstract

Serverless computing greatly simplifies the use of cloud resources. In particular, Function-as-a-Service (FaaS) platforms enable programmers to develop applications as individual functions that can run and scale independently. Unfortunately, applications that require fine-grained support for mutable state and synchronization, such as machine learning (ML) and scientific computing, are notoriously hard to build with this new paradigm. In this work, we aim to bridge this gap. We present Crucial, a system to program highly parallel stateful serverless applications. Crucial retains the simplicity of serverless computing. It is built upon the key insight that FaaS resembles concurrent programming at the scale of a datacenter. Accordingly, a distributed shared memory layer is the natural answer to the need for fine-grained state management and synchronization. Crucial makes it possible to port a multi-threaded code base to serverless effortlessly, where it can benefit from the scalability and pay-per-use model of FaaS platforms. We validate Crucial with the help of micro-benchmarks and by considering various stateful applications. Beyond classical parallel tasks (e.g., a Monte Carlo simulation), these applications include representative ML algorithms such as k-means and logistic regression. Our evaluation shows that Crucial obtains performance superior or comparable to Apache Spark (18%–40% faster) at similar cost. We also use Crucial to port (part of) a state-of-the-art multi-threaded ML library to serverless. The ported application is up to 30% faster than on a dedicated high-end server. Finally, we attest that Crucial can rival the performance of a single-machine, multi-threaded implementation of a complex coordination problem. Overall, Crucial delivers all these benefits while changing less than 6% of the code bases of the evaluated applications.
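To make the programming model concrete, here is a minimal sketch of the kind of multi-threaded, shared-state program the abstract has in mind: a Monte Carlo estimate of π in which worker threads add their results to one shared counter. It uses only Python's standard `threading` module and is not Crucial's API; the names `SharedCounter`, `count_hits`, and `estimate_pi` are hypothetical and chosen for illustration. Following the abstract's description, a port to Crucial would run the workers as cloud functions and keep the counter in the distributed shared memory layer, while the structure of the program would stay largely the same.

```python
import random
import threading


class SharedCounter:
    """Lock-protected counter: the mutable state shared by all workers.

    Illustrative stand-in only. In a serverless port, this is the kind of
    object that would have to live in a shared memory layer.
    """

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def add(self, amount: int) -> None:
        # The lock makes the read-modify-write atomic across threads.
        with self._lock:
            self._value += amount

    def get(self) -> int:
        with self._lock:
            return self._value


def count_hits(samples: int, seed: int) -> int:
    """Count random points in the unit square that fall inside the unit circle."""
    rng = random.Random(seed)  # private generator per worker, seeded for repeatability
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def worker(counter: SharedCounter, samples: int, seed: int) -> None:
    # Compute locally, then publish the result with one synchronized update.
    counter.add(count_hits(samples, seed))


def estimate_pi(num_workers: int = 4, samples_per_worker: int = 50_000) -> float:
    counter = SharedCounter()
    threads = [
        threading.Thread(target=worker, args=(counter, samples_per_worker, seed))
        for seed in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # The fraction of points inside the quarter circle approximates pi / 4.
    return 4.0 * counter.get() / (num_workers * samples_per_worker)


if __name__ == "__main__":
    print(f"pi is approximately {estimate_pi():.4f}")
```

Each worker touches the shared counter only once, after its local loop. That keeps synchronization cheap on one machine, and it matters more when every update to shared state crosses the network.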
