Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering: Latest Papers

HHVM Performance Optimization for Large Scale Web Services
Yuhao Li, Abhishek Gupta, Alex Yang, Peinan Chen, Joey Pinto, B. Karrer, Mayank Pundir, M. Balandat, A. Kejariwal, Benjamin Lee
{"title":"HHVM Performance Optimization for Large Scale Web Services","authors":"Yuhao Li, Abhishek Gupta, Alex Yang, Peinan Chen, Joey Pinto, B. Karrer, Mayank Pundir, M. Balandat, A. Kejariwal, Benjamin Lee","doi":"10.1145/3578244.3583720","DOIUrl":"https://doi.org/10.1145/3578244.3583720","url":null,"abstract":"HHVM is commonly developed for large online web services, yet there remains much room for optimizing HHVM performance. This paper discusses challenges and techniques in optimizing HHVM performance for Meta's web service. We begin by evaluating the effectiveness of semantic request routing, a request routing method aimed at enhancing code cache performance in HHVM, and examine its implications for optimizing HHVM performance. Second, we characterize HHVM performance for a large-scale datacenter and identify the challenges brought by uncontrollable confounding factors. Finally, we present the performance management framework for autotuning HHVM performance at scale.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"226 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116854925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automated Optimisation of Modern Software System Properties
Federica Sarro
{"title":"Automated Optimisation of Modern Software System Properties","authors":"Federica Sarro","doi":"10.1145/3578244.3583739","DOIUrl":"https://doi.org/10.1145/3578244.3583739","url":null,"abstract":"Realizing modern software systems poses new challenges to the software engineers: Users of applications running on limited capability devices still demand acceptable performance [2, 5, 13, 15]; users of systems relying on artificial intelligence to make decisions (rightly) reclaim a fair treatment [4, 7, 12]; users of social networking systems expect to be protected against malicious behaviours [1]. Moreover, AI-enabled software systems are so energy-greedy that their usage is causing an alarming surge in energy consumption with a significant increase in CO2 emissions [10]. Equipping software with appealing functionalities and minimising faults is not enough if the emerging non-functional properties of these systems, such as fairness, safety and sustainability, are not taken into account. Mobile users will stop using an app if it is too slow or uses too much bandwidth [5, 13]. Human bias can be transferred to various real-world systems relying on ML: Bias has been found in advertisement, recruitment, admission processes [3, 9, 19], among others, and human rights [16]. A growing number of malicious users use well-intentioned software platforms as a tool to attack the innocent users with whom they share the platform. Examples of such harmful acts are sadly too many to list; they include bullying, harassment, hate speech, misinformation, election interference, scamming and spamming. ChatGPT is an AI model able to answer a variety of questions, compose essays, have philosophical conversations, and even code or fix bugs [18]. However, all these come at a high cost: ChatGPT has been estimated to consume the equivalent of the electricity consumed by 175,000 people in Denmark per month. 
In this keynote, I will discuss the necessity to take these properties into account when realizing these types of systems, and the extent to which it is possible to automate their optimization. I will discuss existing solutions based mainly, but not exclusively, on multi-objective optimisation [5, 6, 8, 10, 14, 17]. In fact, we cannot expect that a software engineer, regardless of their level of expertise, would be able to manually find all opportunities for optimising these non-functional properties [11]. I will review research trends, presenting results from the SOLAR group and others. I will also discuss some directions for future work and open challenges towards achieving better, fairer, safer and greener software.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121464297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Predicting the Performance of ATL Model Transformations
Raffaela Groner, Peter Bellmann, S. Höppner, Patrick Thiam, F. Schwenker, Matthias Tichy
{"title":"Predicting the Performance of ATL Model Transformations","authors":"Raffaela Groner, Peter Bellmann, S. Höppner, Patrick Thiam, F. Schwenker, Matthias Tichy","doi":"10.1145/3578244.3583727","DOIUrl":"https://doi.org/10.1145/3578244.3583727","url":null,"abstract":"Model transformation languages are special-purpose languages, which are designed to define transformations as comfortably as possible, i.e., often in a declarative way. Typically, developers create their transformations based on small input models which systematically cover the language of the input models. This makes it difficult for the developers to estimate how the transformations would perform for a large and diverse set of input models. Hence, developers would benefit from an approach for predicting the performance of model transformations based on just abstract characteristics of input models. Regression approaches based on machine learning lend themselves well to such predictions. However, it is currently unknown whether and which regression approach is suitable in this context as well as how a model should be abstractly characterized for this purpose. We conducted several experiments to analyze how well different machine learning methods predict the execution time of model transformations defined in the Atlas Transformation Language (ATL) for distinct sets of model characteristics. As possible methods, we have investigated linear regression, random forests and support vector regression using a radial basis function kernel. The results of our experiments show that support vector regression is the best choice in terms of usability and prediction accuracy for the model transformation modules covered in our experiments and is thus suited for a prediction approach. 
In addition, simple model characterizations based only on the number of model elements, the number of references, and the number of attributes are a suitable way to easily describe a model and to achieve decent prediction accuracy.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130524641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Implementation of Dataflow Software Pipelining for Codelet Model
Siddhisanket Raskar, Jose M Monsalve Diaz, T. Applencourt, Kalyan Kumaran, Guangrong Gao
{"title":"Implementation of Dataflow Software Pipelining for Codelet Model","authors":"Siddhisanket Raskar, Jose M Monsalve Diaz, T. Applencourt, Kalyan Kumaran, Guangrong Gao","doi":"10.1145/3578244.3583734","DOIUrl":"https://doi.org/10.1145/3578244.3583734","url":null,"abstract":"Computer architectures have evolved from single core to chips with thousands of cores. Loop and instruction level parallelism techniques like software pipelining that are successful for single cores have limitations in the multi-core era. We extend the software pipelining technology beyond the limits of fine-grained, instruction-level parallelism. We accomplish this through dataflow software pipelining technology and its extension. Specifically, we present extensions to dataflow-based codelet model and its abstract machine to exploit pipelined parallelism across loops. We extend the runtime implementation of the codelet model with our proposed extensions to take advantage of dataflow software pipelining principles using efficient single-owner fifo buffer across Codelet's dependencies. We show promising improvements with the use of dataflow software pipelining techniques by performing an in-depth case study of Cannon's algorithm for matrix multiplication.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128854643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Packet-Level Analysis of Zoom Performance Anomalies
Mehdi Karamollahi, C. Williamson, M. Arlitt
{"title":"Packet-Level Analysis of Zoom Performance Anomalies","authors":"Mehdi Karamollahi, C. Williamson, M. Arlitt","doi":"10.1145/3578244.3583725","DOIUrl":"https://doi.org/10.1145/3578244.3583725","url":null,"abstract":"In this paper, we use Wireshark packet-level traces to study the performance of the Zoom network application. Our work is motivated by several anecdotal reports of Zoom performance problems on our campus network during the Fall 2021 semester. Through the collection and analysis of Wireshark traces from different vantage points, we are able to pinpoint the root cause of the Zoom performance problems, which is a congested external Internet link for our campus network. We also identify several characteristics of the Zoom application that exacerbate its performance issues on congested and lossy networks, due to multi-layer protocol interactions.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127815241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Lightweight Kubernetes Distributions: A Performance Comparison of MicroK8s, k3s, k0s, and Microshift
H. Koziolek, Nafise Eskandani
{"title":"Lightweight Kubernetes Distributions: A Performance Comparison of MicroK8s, k3s, k0s, and Microshift","authors":"H. Koziolek, Nafise Eskandani","doi":"10.1145/3578244.3583737","DOIUrl":"https://doi.org/10.1145/3578244.3583737","url":null,"abstract":"With containers becoming a prevalent method of software deployment, there is an increasing interest to use container orchestration frameworks not only in data centers, but also on resource-constrained hardware, such as Internet-of-Things devices, Edge gateways, or developer workstations. Consequently, software vendors have released several lightweight Kubernetes (K8s) distributions for container orchestration in the last few years, but it remains difficult for software developers to select an appropriate solution. Existing studies on lightweight K8s distribution performance tested only small workloads, showed inconclusive results, and did not cover recently released distributions. The contribution of this paper is a comparison of MicroK8s, k3s, k0s, and MicroShift, investigating their minimal resource usage as well as control plane and data plane performance in stress scenarios. 
While k3s and k0s showed by a small amount the highest control plane throughput and MicroShift showed the highest data plane throughput, usability, security, and maintainability are additional factors that drive the decision for an appropriate distribution.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131185515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Systematically Exploring High-Performance Representations of Vector Fields Through Compile-Time Composition
Stephen Nicholas Swatman, A. Varbanescu, A. Pimentel, A. Salzburger, A. Krasznahorkay
{"title":"Systematically Exploring High-Performance Representations of Vector Fields Through Compile-Time Composition","authors":"Stephen Nicholas Swatman, A. Varbanescu, A. Pimentel, A. Salzburger, A. Krasznahorkay","doi":"10.1145/3578244.3583723","DOIUrl":"https://doi.org/10.1145/3578244.3583723","url":null,"abstract":"We present a novel benchmark suite for implementations of vector fields in high-performance computing environments to aid developers in quantifying and ranking their performance. We decompose the design space of such benchmarks into access patterns and storage backends, the latter of which can be further decomposed into components with different functional and non-functional properties. Through compile-time meta-programming, we generate a large number of benchmarks with minimal effort and ensure the extensibility of our suite. Our empirical analysis, based on real-world applications in high-energy physics, demonstrates the feasibility of our approach on CPU and GPU platforms, and highlights that our suite is able to evaluate performance-critical design choices. Finally, we propose that our work towards composing vector fields from elementary components is not only useful for the purposes of benchmarking, but that it naturally gives rise to a novel library for implementing such fields in domain applications.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133261215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Evaluating the Energy Measurements of the IBM POWER9 On-Chip Controller
Hannes Tröpgen, Mario Bielert, T. Ilsche
{"title":"Evaluating the Energy Measurements of the IBM POWER9 On-Chip Controller","authors":"Hannes Tröpgen, Mario Bielert, T. Ilsche","doi":"10.1145/3578244.3583729","DOIUrl":"https://doi.org/10.1145/3578244.3583729","url":null,"abstract":"Dependable power measurements are the backbone of energy-efficient computing systems. The IBM PowerNV platform offers such power measurements through an embedded PowerPC 405 processor: The On-Chip Controller (OCC). Among other system-control tasks, the OCC provides power measurements for several domains, such as system, CPU, and GPU. This paper provides a detailed description and an in-depth evaluation of these OCC-provided power measurements. For that, we describe the provided interfaces themselves and experimentally verify their overhead (3.6 µs to 10.8 µs per access) and readout rate (24.95 Sa/s). We also study the consistency of the reported sensor readouts across the measurement domains and compare it to externally measured data. Furthermore, we estimate the internal sampling rate (1996 Sa/s) by provoking aliasing errors with artificial workloads, and quantify the errors that such aliasing could introduce in practice (for power consumption of processors 12% in our experimental worst-case scenario). 
Given these insights, practitioners using the IBM PowerNV platform can assess the quality of the embedded measurements, permitting sought-after energy efficiency improvements.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116436634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DrGPU: A Top-Down Profiler for GPU Applications
Yueming Hao, Nikhil Jain, R. Van der Wijngaart, N. Saxena, Yuanbo Fan, Xu Liu
{"title":"DrGPU: A Top-Down Profiler for GPU Applications","authors":"Yueming Hao, Nikhil Jain, R. Van der Wijngaart, N. Saxena, Yuanbo Fan, Xu Liu","doi":"10.1145/3578244.3583736","DOIUrl":"https://doi.org/10.1145/3578244.3583736","url":null,"abstract":"GPUs have become common in HPC systems to accelerate scientific computing and machine learning applications. Efficiently mapping these applications to rapid evolutions of GPU architectures for high performance is a well-known challenge. Various performance inefficiencies exist in GPU kernels that impede applications from obtaining bare-metal performance. While existing tools are able to measure these inefficiencies, they mostly focus on data collection and presentation, requiring significant manual efforts to understand the root causes for actionable optimization. Thus, we develop DrGPU, a novel profiler that performs top-down analysis to guide GPU code optimization. As its salient feature, DrGPU leverages hardware performance counters available in commodity GPUs to quantify stall cycles, decompose them into various stall reasons, pinpoint root causes, and provide intuitive optimization guidance. With the help of DrGPU, we are able to analyze important GPU benchmarks and applications and obtain nontrivial speedups --- up to 1.77X on V100 and 2.03X on GTX 1650.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127113344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Is Sharing Caring? Analyzing the Incentives for Shared Cloud Clusters
Talha Mehboob, Noman Bashir, M. Zink, David E. Irwin
{"title":"Is Sharing Caring? Analyzing the Incentives for Shared Cloud Clusters","authors":"Talha Mehboob, Noman Bashir, M. Zink, David E. Irwin","doi":"10.1145/3578244.3583730","DOIUrl":"https://doi.org/10.1145/3578244.3583730","url":null,"abstract":"Many organizations maintain and operate large shared computing clusters, since they can substantially reduce computing costs by leveraging statistical multiplexing to amortize it across all users. Importantly, such shared clusters are generally not free to use, but have an internal pricing model that funds their operation. Since employees at many large organizations, especially Universities, have some budgetary autonomy over purchase decisions, internal shared clusters are increasingly competing for users with cloud platforms, which may offer lower costs and better performance. As a result, many organizations are shifting their shared clusters to operate on cloud resources. This paper empirically analyzes the user incentives for shared cloud clusters under two different pricing models using an 8-year job trace from a large shared cluster for a large University system. Our analysis shows that, with either pricing model, a large fraction of users have little financial incentive to participate in a shared cloud cluster compared to directly acquiring resources from a cloud platform. While shared cloud clusters can provide some limited reductions in cost by leveraging reserved instances at a discount, due to bursty workloads, realizing these reductions generally requires imposing long job waiting times, which for many users are likely not worth the cost reduction. In particular, we show that, assuming users defect from the shared cluster if their wait time is greater than 15x their average job runtime, over 80% of the users would defect, which increases the price of the remaining users such that it eliminates any incentive to participate in a shared cluster. 
Thus, while shared cloud clusters may provide users other benefits, their financial incentives are weak.","PeriodicalId":160204,"journal":{"name":"Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129546254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0