Martin Straesser, Jonas Mathiasch, A. Bauer, Samuel Kounev
{"title":"A Systematic Approach for Benchmarking of Container Orchestration Frameworks","authors":"Martin Straesser, Jonas Mathiasch, A. Bauer, Samuel Kounev","doi":"10.1145/3578244.3583726","DOIUrl":"https://doi.org/10.1145/3578244.3583726","url":null,"abstract":"Container orchestration frameworks play a critical role in modern cloud computing paradigms such as cloud-native or serverless computing. They significantly impact the quality and cost of service deployment as they manage many performance-critical tasks such as container provisioning, scheduling, scaling, and networking. Consequently, a comprehensive performance assessment of container orchestration frameworks is essential. However, until now, there is no benchmarking approach that covers the many different tasks implemented in such platforms and supports evaluating different technology stacks. In this paper, we present a systematic approach that enables benchmarking of container orchestrators. Based on a definition of container orchestration, we define the core requirements and benchmarking scope for such platforms. Each requirement is then linked to metrics and measurement methods, and a benchmark architecture is proposed. With COFFEE, we introduce a benchmarking tool supporting the definition of complex test campaigns for container orchestration frameworks. We demonstrate the potential of our approach with case studies of the frameworks Kubernetes and Nomad in a self-hosted environment and on the Google Cloud Platform. The presented case studies focus on container startup times, crash recovery, rolling updates, and more.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122341610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martin Straesser, Simon Eismann, Jóakim von Kistowski, A. Bauer, Samuel Kounev
{"title":"Autoscaler Evaluation and Configuration: A Practitioner's Guideline","authors":"Martin Straesser, Simon Eismann, Jóakim von Kistowski, A. Bauer, Samuel Kounev","doi":"10.1145/3578244.3583721","DOIUrl":"https://doi.org/10.1145/3578244.3583721","url":null,"abstract":"Autoscalers are indispensable parts of modern cloud deployments and determine the service quality and cost of a cloud application in dynamic workloads. The configuration of an autoscaler strongly influences its performance and is also one of the biggest challenges and showstoppers for the practical applicability of many research autoscalers. Many proposed cloud experiment methodologies can only be partially applied in practice, and many autoscaling papers use custom evaluation methods and metrics. This paper presents a practical guideline for obtaining meaningful and interpretable results on autoscaler performance with reasonable overhead. We provide step-by-step instructions for defining realistic usage behaviors and traffic patterns. We divide the analysis of autoscaler performance into a qualitative antipattern-based analysis and a quantitative analysis. To demonstrate the applicability of our guideline, we conduct several experiments with a microservice of our industry partner in a realistic test environment.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126435878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pushing the Limits of Video Game Performance: A Performance Engineering Perspective","authors":"Mathieu Nayrolles","doi":"10.1145/3578244.3583738","DOIUrl":"https://doi.org/10.1145/3578244.3583738","url":null,"abstract":"Ubisoft constantly pushes the boundaries of game development to create immersive worlds that capture the imagination of millions of players worldwide. To achieve this, performance engineering plays a crucial role in ensuring that games run smoothly on various platforms and devices. In this talk, we will explore the latest advancements in the field of performance engineering for video games, focusing on runtime performance, network optimization, backend and database optimization, and cloud gaming. We will discuss how machine learning techniques enhance classical profiling and optimize game engine scheduling. Additionally, we will address the challenges of deterministic replication of assets between clients and optimizing micro-services for cloud gaming experiences. Lastly, we will touch on the importance of performance engineering for non-code aspects of game development, such as animation, textures, props, and assets.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131335967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application Knowledge Required: Performance Modeling for Fun and Profit","authors":"G. Hager","doi":"10.1145/3578244.3585384","DOIUrl":"https://doi.org/10.1145/3578244.3585384","url":null,"abstract":"In High Performance Computing, resource efficiency is paramount. Expensive systems need to be utilized to the maximum of their capabilities, but deep insight into the bottlenecks of a particular hardware-software combination is often lacking on the users' side. Analytic, first-principles performance models can provide such insight. They are built on simplified descriptions of the machine, the software, and how they interact. This goes, to some extent, against the general trend towards automation in computer science; the individual conducting the analysis does require some knowledge of the application and the hardware in order to make performance engineering a scientific process instead of blindly generating data with tools that are poorly understood. This talk uses examples from parallel high-performance computing to demonstrate how analytic performance models can support scientific thinking in performance engineering: Sparse matrix-vector multiplication, the HPCG benchmark, the CloverLeaf proxy app, and a lattice-Boltzmann solver. Interestingly, the most intriguing insights emerge from the failure of analytic models to accurately predict performance measurements.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128292117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Inference Latency of Neural Architectures on Mobile Devices","authors":"Zhuojin Li, Marco Paolieri, L. Golubchik","doi":"10.1145/3578244.3583735","DOIUrl":"https://doi.org/10.1145/3578244.3583735","url":null,"abstract":"Due to the proliferation of inference tasks on mobile devices, state-of-the-art neural architectures are typically designed using Neural Architecture Search (NAS) to achieve good tradeoffs between machine learning accuracy and inference latency. While measuring inference latency of a huge set of candidate architectures during NAS is not feasible, latency prediction for mobile devices is challenging, because of hardware heterogeneity, optimizations applied by machine learning frameworks, and diversity of neural architectures. Motivated by these challenges, we first quantitatively assess the characteristics of neural architectures and mobile devices that have significant effects on inference latency. Based on this assessment, we propose an operation-wise framework which addresses these challenges by developing operation-wise latency predictors and achieves high accuracy in end-to-end latency predictions, as shown by our comprehensive evaluations on multiple mobile devices using multicore CPUs and GPUs. To illustrate that our approach does not require expensive data collection, we also show that accurate predictions can be achieved on real-world neural architectures using only small amounts of profiling data.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131899351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simon Volpert, Benjamin Erb, G. Eisenhart, Daniel Seybold, S. Wesner, Jörg Domaschka
{"title":"A Methodology and Framework to Determine the Isolation Capabilities of Virtualisation Technologies","authors":"Simon Volpert, Benjamin Erb, G. Eisenhart, Daniel Seybold, S. Wesner, Jörg Domaschka","doi":"10.1145/3578244.3583728","DOIUrl":"https://doi.org/10.1145/3578244.3583728","url":null,"abstract":"The capability to isolate system resources is an essential characteristic of virtualisation technologies and is therefore important for research and industry alike. It allows the co-location of experiments and workloads, the partitioning of system resources and enables multi-tenant business models such as cloud computing. Poor isolation among tenants bears the risk of noisy-neighbour and contention effects, which negatively impact all of those use-cases. These effects describe the negative impact of one tenant onto another by utilising shared resources. Both industry and research provide many different concepts and technologies to realise isolation. Yet, the isolation capabilities of all these different approaches are not well understood; nor is there an established way to measure the quality of their isolation capabilities. Such an understanding, however, is of utmost importance in practice to make an informed decision on a suitable implementation. Hence, in this work, we present a novel methodology to measure the isolation capabilities of virtualisation technologies for system resources that fulfils all requirements for benchmarking, including reliability. It relies on an immutable approach, based on Experiment-as-Code. The complete process holistically includes everything from bare metal resource provisioning to the actual experiment enactment. The results determined by this methodology help in choosing a virtualisation technology based on its capability to isolate given resources. Such results are presented here as a closing example in order to validate the proposed methodology.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127319133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the Performance of SD-WAN Enabled Service Function Chains Across the Globe with AWS","authors":"Aris Leivadeas, Nikolai Pitaev, M. Falkner","doi":"10.1145/3578244.3583722","DOIUrl":"https://doi.org/10.1145/3578244.3583722","url":null,"abstract":"Cloud Computing has revolutionized the information technology world and the application offering over the last two decades. At the same time, recent trends in Network Function Virtualization (NFV) and Software-Defined Wide Area Networks (SD-WAN), and the combination of those with the Cloud paradigm, have allowed an unprecedented shift of enterprise networking services towards the Public Cloud. Even though this network evolutionary approach brings many benefits, it also presents many drawbacks. The performance stability and service continuity over a black box Public Cloud infrastructure can hinder the formal service guarantees that many new emerging applications may require. To this end, in this paper, we aim to shed light on the overall performance achieved when deploying coast-to-coast and intercontinental Service Function Chains (SFCs) that interconnect geographically distributed enterprise branches over the Amazon Web Services (AWS) infrastructure. In particular, we investigate the impact of region, Virtual Machine (VM) instance, time of the day and day of the week on the overall throughput and delay attained. The obtained results show the strengths and weaknesses of entirely relying on the AWS infrastructure to offer networking services by investigating possible hidden performance bottlenecks.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121748760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jasper A. Hasenoot, Jan S. Rellermeyer, Alexandru Uta
{"title":"The Performance of Distributed Applications: A Traffic Shaping Perspective","authors":"Jasper A. Hasenoot, Jan S. Rellermeyer, Alexandru Uta","doi":"10.1145/3578244.3583733","DOIUrl":"https://doi.org/10.1145/3578244.3583733","url":null,"abstract":"Widely used in datacenters and clouds, network traffic shaping is a performance influencing factor that is often overlooked when benchmarking or simply deploying distributed applications. While in theory traffic shaping should allow for a fairer sharing of network resources, in practice it also introduces new problems: performance (measurement) inconsistency and long tails. In this paper we investigate the effects of traffic shaping mechanisms on common distributed applications. We characterize the performance of a distributed key-value store, big data workloads, and high-performance computing under state-of-the-art benchmarks, while the underlying network's traffic is shaped using state-of-the-art mechanisms such as token-buckets or priority queues. Our results show that the impact of traffic shaping needs to be taken into account when benchmarking or deploying distributed applications. To help researchers, practitioners, and application developers we uncover several practical implications and make recommendations on how certain applications are to be deployed so that performance is least impacted by the shaping protocols.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130947922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonardo Passig Horstmann, Matheus Wagner, A. A. Fröhlich
{"title":"A Method to Evaluate the Performance of Predictors in Cyber-Physical Systems","authors":"Leonardo Passig Horstmann, Matheus Wagner, A. A. Fröhlich","doi":"10.1145/3578244.3583732","DOIUrl":"https://doi.org/10.1145/3578244.3583732","url":null,"abstract":"Cyber-Physical Systems (CPS) rely on sensing to control and optimize their operation. Nevertheless, sensing itself is prone to errors that can originate at several stages, from sampling to communication. In this context, several systems adopt multivariate predictors to assess the quality of the sensed data, to replace data from faulty sensors, or to derive variables that cannot be directly sensed. These predictors are often evaluated based on their accuracy and computing demands; however, such evaluations often do not consider the system's architecture from a broader perspective, ignoring the way components are interconnected and how they cascade as inputs of other Machine Learning (ML) models. In this work, we introduce a method to evaluate the performance of interdependent predictors based on the stability of the estimation error dynamics in faulty scenarios. The proposed method estimates the ability of a predictor to produce accurate predictions while accounting for the impacts of cascading predicted values as its inputs. The prediction correctness is estimated based solely on information acquired during the training of the multivariate predictors and mathematical properties of the ML activation functions. The proposed method is evaluated with a meaningful dataset in the scope of monitoring and control of a Cyber-Physical System, and the evaluation demonstrates the ability of the proposed method to account for the interdependence of data predictors.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122137731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Cengiz, M. Forshaw, Amir Atapour-Abarghouei, A. McGough
{"title":"Predicting the Performance of a Computing System with Deep Networks","authors":"M. Cengiz, M. Forshaw, Amir Atapour-Abarghouei, A. McGough","doi":"10.1145/3578244.3583731","DOIUrl":"https://doi.org/10.1145/3578244.3583731","url":null,"abstract":"Predicting the performance and energy consumption of computing hardware is critical for many modern applications. This will inform procurement decisions, deployment decisions, and autonomic scaling. Existing approaches to understanding the performance of hardware largely focus on benchmarking -- leveraging standardised workloads which seek to be representative of an end-user's needs. Two key challenges are present: benchmark workloads may not be representative of an end-user's workload, and benchmark scores are not easily obtained for all hardware. Within this paper, we demonstrate the potential to build Deep Learning models to predict benchmark scores for unseen hardware. We undertake our evaluation with the openly available SPEC 2017 benchmark results. We evaluate three different networks, one fully-connected network along with two Convolutional Neural Networks (one bespoke and one ResNet inspired) and demonstrate impressive R² scores of 0.96, 0.98 and 0.94 respectively.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127391202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}