{"title":"HARDLESS: A Generalized Serverless Compute Architecture for Hardware Processing Accelerators","authors":"Sebastian Werner, Trever Schirmer","doi":"10.1109/IC2E55432.2022.00016","DOIUrl":"https://doi.org/10.1109/IC2E55432.2022.00016","url":null,"abstract":"The increasing use of hardware processing accelerators tailored for specific applications, such as the Vision Processing Unit (VPU) for image recognition, further increases developers' configuration, development, and management over-head. Developers have successfully used fully automated elastic cloud services such as serverless computing to counter these additional efforts and shorten development cycles for applications running on CPUs. Unfortunately, current cloud solutions do not yet provide these simplifications for applications that require hardware acceleration. However, as the development of special-ized hardware acceleration continues to provide performance and cost improvements, it will become increasingly important to enable ease of use in the cloud. In this paper, we present an initial design and implemen-tation of Hardless, an extensible and generalized serverless computing architecture that can support workloads for arbitrary hardware accelerators. We show how Hardless can scale across different commodity hardware accelerators and support a variety of workloads using the same execution and programming model common in serverless computing today.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127898169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Houkun Zhu, Dominik Scheinert, L. Thamsen, Kordian Gontarska, O. Kao
{"title":"Magpie: Automatically Tuning Static Parameters for Distributed File Systems using Deep Reinforcement Learning","authors":"Houkun Zhu, Dominik Scheinert, L. Thamsen, Kordian Gontarska, O. Kao","doi":"10.1109/IC2E55432.2022.00023","DOIUrl":"https://doi.org/10.1109/IC2E55432.2022.00023","url":null,"abstract":"Distributed file systems are widely used nowadays, yet using their default configurations is often not optimal. At the same time, tuning configuration parameters is typically challenging and time-consuming. It demands expertise and tuning operations can also be expensive. This is especially the case for static parameters, where changes take effect only after a restart of the system or workloads. We propose a novel approach, Magpie, which utilizes deep re-inforcement learning to tune static parameters by strategically ex-ploring and exploiting configuration parameter spaces. To boost the tuning of the static parameters, our method employs both server and client metrics of distributed file systems to understand the relationship between static parameters and performance. Our empirical evaluation results show that Magpie can noticeably improve the performance of the distributed file system Lustre, where our approach on average achieves 91.8 % throughput gains against default configuration after tuning towards single performance indicator optimization, while it reaches 39.7% more throughput gains against the baseline.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124725191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonathan Will, L. Thamsen, Jonathan Bader, Dominik Scheinert, O. Kao
{"title":"Get Your Memory Right: The Crispy Resource Allocation Assistant for Large-Scale Data Processing","authors":"Jonathan Will, L. Thamsen, Jonathan Bader, Dominik Scheinert, O. Kao","doi":"10.1109/IC2E55432.2022.00014","DOIUrl":"https://doi.org/10.1109/IC2E55432.2022.00014","url":null,"abstract":"Distributed dataflow systems like Apache Spark and Apache Hadoop enable data-parallel processing of large datasets on clusters. Yet, selecting appropriate computational resources for dataflow jobs — that neither lead to bottlenecks nor to low resource utilization — is often challenging, even for expert users such as data engineers. Further, existing automated approaches to resource selection rely on the assumption that a job is recurring to learn from previous runs or to warrant the cost of full test runs to learn from. However, this assumption often does not hold since many jobs are too unique. Therefore, we present Crispy, a method for optimizing data processing cluster configurations based on job profiling runs with small samples of the dataset on just a single machine. Crispy attempts to extrapolate the memory usage for the full dataset to then choose a cluster configuration with enough total memory. In our evaluation on a dataset with 1031 Spark and Hadoop jobs, we see a reduction of job execution costs by 56% compared to the baseline, while on average spending less than ten minutes on profiling runs per job on a consumer-grade laptop.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121971103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trever Schirmer, Joel Scheuner, Tobias Pfandzelter, David Bermbach
{"title":"Fusionize: Improving Serverless Application Performance through Feedback-Driven Function Fusion","authors":"Trever Schirmer, Joel Scheuner, Tobias Pfandzelter, David Bermbach","doi":"10.1109/IC2E55432.2022.00017","DOIUrl":"https://doi.org/10.1109/IC2E55432.2022.00017","url":null,"abstract":"Serverless computing increases developer productivity by removing operational concerns such as managing hardware or software runtimes. Developers, however, still need to partition their application into functions, which can be error-prone and adds complexity: Using a small function size where only the smallest logical unit of an application is inside a function maximizes flexibility and reusability. Yet, having small functions leads to invocation overheads, additional cold starts, and may increase cost due to double billing during synchronous invocations. In this paper we present Fusionize, a framework that removes these concerns from developers by automatically fusing the application code into a multi-function orchestration with varying function size. Developers only need to write the application code following a lightweight programming model and do not need to worry how the application is turned into functions. Our framework automatically fuses different parts of the application into functions and manages their interactions. Leveraging monitoring data, the framework optimizes the distribution of application parts to functions to optimize deployment goals such as end-to-end latency and cost. Using two example applications, we show that Fusionizecan automatically and iteratively improve the deployment artifacts of the application.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114406883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tobias Pfandzelter, S. Henning, Trever Schirmer, W. Hasselbring, David Bermbach
{"title":"Streaming vs. Functions: A Cost Perspective on Cloud Event Processing","authors":"Tobias Pfandzelter, S. Henning, Trever Schirmer, W. Hasselbring, David Bermbach","doi":"10.1109/IC2E55432.2022.00015","DOIUrl":"https://doi.org/10.1109/IC2E55432.2022.00015","url":null,"abstract":"In cloud event processing, data generated at the edge is processed in real-time by cloud resources. Both distributed stream processing (DSP) and Function-as-a-Service (FaaS) have been proposed to implement such event processing applications. FaaS emphasizes fast development and easy operation, while DSP emphasizes efficient handling of large data volumes. Despite their architectural differences, both can be used to model and implement loosely-coupled job graphs. In this paper, we consider the selection of FaaS and DSP from a cost perspective. We implement stateless and stateful workflows from the Theodolite benchmarking suite using cloud FaaS and DSP. In an extensive evaluation, we show how application type, cloud service provider, and runtime environment can influence the cost of application deployments and derive decision guidelines for cloud engineers.","PeriodicalId":415781,"journal":{"name":"2022 IEEE International Conference on Cloud Engineering (IC2E)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133058232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}