{"title":"Offline and Online Algorithms for SSD Management","authors":"Tomer Lange, J. Naor, G. Yadgar","doi":"10.1145/3489048.3522630","DOIUrl":"https://doi.org/10.1145/3489048.3522630","url":null,"abstract":"The abundance of system-level optimizations for reducing SSD write amplification, which are usually based on experimental evaluation, stands in contrast to the lack of theoretical algorithmic results in this problem domain. To bridge this gap, we explore the problem of reducing write amplification from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm. In the online setting, we first consider algorithms that have no prior knowledge about the input. We present a worst case lower bound and show that the greedy algorithm is optimal in this setting. Then we design an online algorithm that uses predictions about the input. We show that when predictions are pretty accurate, our algorithm circumvents the above lower bound. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit an improved performance for a wide range of input traces.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132224136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusing Speed Index during Web Page Loading","authors":"Wei Liu, Xinlei Yang, Hao Lin, Zhenhua Li, Feng Qian","doi":"10.1145/3489048.3522663","DOIUrl":"https://doi.org/10.1145/3489048.3522663","url":null,"abstract":"With conventional web page load metrics (e.g., Page Load Time) being blamed for deviating from actual user experiences, in recent years a more sensible and complex metric called Speed Index (SI) has been widely adopted to measure the user's quality of experience (QoE). In brief, SI indicates how quickly a page is filled up with above-the-fold visible elements (or crucial elements for short). To date, however, SI has been used as a metric for performance evaluation, rather than as an explicit heuristic to improve page loading. To demystify this, we examine the entire loading process of various pages and ascribe such incapability to three-fold fundamental uncertainties in terms of network, browser execution, and viewport size. In this paper, we design SipLoader, an SI-oriented page load scheduler through a novel cumulative reactive scheduling framework. It does not attempt to deal with uncertainties in advance or in one shot, but schedules page loading by \"repairing\" the anticipated (nearly) SI-optimal scheduling when uncertainties actually occur. This is achieved with a suite of efficient designs that fully exploit the cumulative nature of SI calculation. Evaluations show that SipLoader improves the median SI by 41%, and provides 1.43 times to 1.99 times more benefits than state-of-the-art solutions.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132511215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WISEFUSE: Workload Characterization and DAG Transformation for Serverless Workflows","authors":"Ashraf Y. Mahgoub, E. Yi, Karthick Shankar, Eshaan Minocha, S. Elnikety, S. Bagchi, S. Chaterji","doi":"10.1145/3489048.3530959","DOIUrl":"https://doi.org/10.1145/3489048.3530959","url":null,"abstract":"We characterize production workloads of serverless DAGs at a major cloud provider. Our analysis highlights two major factors that limit performance: (a) lack of efficient communication methods between the serverless functions in the DAG, and (b) stragglers when a DAG stage invokes a set of parallel functions that must complete before starting the next DAG stage. To address these limitations, we propose WISEFUSE, an automated approach to generate an optimized execution plan for serverless DAGs for a user-specified latency objective or $ budget. We introduce three optimizations: (1) Fusion combines in-series functions together in a single VM to reduce the communication overhead between cascaded functions. (2) Bundling executes a group of parallel invocations of a function in one VM to improve resource sharing among the parallel workers to reduce skew. (3) Resource Allocation assigns the right VM size to each function or function bundle in the DAG to reduce the E2E latency and cost. We implement WISEFUSE to evaluate it experimentally using three popular serverless applications with different DAG structures, memory footprints, and intermediate data sizes. Compared to competing approaches and other alternatives, WISEFUSE shows significant improvements in E2E latency and cost. Specifically, for a machine learning pipeline, WISEFUSE achieves P95 latency that is 67% lower than Photons, 39% lower than Faastlane, and 90% lower than SONIC without increasing the $ cost.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114535360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dremel: Adaptive Configuration Tuning of RocksDB KV-Store","authors":"Chenxingyu Zhao, Tapan Chugh, Jaehong Min, Ming G. Liu, A. Krishnamurthy","doi":"10.1145/3489048.3530970","DOIUrl":"https://doi.org/10.1145/3489048.3530970","url":null,"abstract":"LSM-tree-based key-value stores like RocksDB are widely used to support many applications. However, configuring a RocksDB instance is challenging for the following reasons: 1) RocksDB has a massive parameter space to configure; 2) there are inherent trade-offs and dependencies between parameters; 3) optimal configurations are dependent on workload and hardware; and 4) evaluating configurations is time-consuming. Prior works struggle with handling the curse of dimensionality, capturing relationships between parameters, adapting configurations to workload and hardware, and evaluating quickly. We present a system, Dremel, to adaptively and quickly configure RocksDB with strategies based on the Multi-Armed Bandit model. To handle the large parameter space, we propose using fused features, which encode domain-specific knowledge, to work as a compact and powerful representation for configurations. To adapt to the workload and hardware, we build an online bandit model to identify the best configuration. To evaluate quickly, we enable multi-fidelity evaluation and upper-confidence-bound sampling to speed up configuration search. Dremel not only achieves up to ×2.61 higher IOPS and 57% less latency than default configurations but also achieves up to 63% improvement over prior works on 18 different settings with the same or smaller time budget. This paper is an abridged version.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128644113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Inference of BGP Location Communities","authors":"B. A. D. Silva, Paulo Mol, O. Fonseca, Ítalo F. S. Cunha, R. Ferreira, Ethan Katz-Bassett","doi":"10.1145/3489048.3522643","DOIUrl":"https://doi.org/10.1145/3489048.3522643","url":null,"abstract":"We present a set of techniques to infer the semantics of BGP communities from public BGP data. Our techniques infer communities related to the entities or locations traversed by a route by correlating communities with AS paths. We also propose a set of heuristics to filter incorrect inferences introduced by misbehaving networks, sharing of BGP communities among sibling autonomous systems, and inconsistent BGP dumps. We apply our techniques to billions of routing records from public BGP collectors and make available a public database with more than 15 thousand location communities. Our comparison with manually-built databases shows our techniques provide high precision (up to 93%), better coverage (up to 81% recall), and dynamic updates, complementing operators' and researchers' abilities to reason about BGP community semantics.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131291643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Traffic Refinery: Cost-Aware Data Representation for Machine Learning on Network Traffic","authors":"F. Bronzino, Paul Schmitt, Sara Ayoubi, Hyojoon Kim, Renata Teixeira, N. Feamster","doi":"10.1145/3489048.3522637","DOIUrl":"https://doi.org/10.1145/3489048.3522637","url":null,"abstract":"Network management often relies on machine learning to make predictions about performance and security from network traffic. Often, the representation of the traffic is as important as the choice of the model. The features that the model relies on, and the representation of those features, ultimately determine model accuracy, as well as where and whether the model can be deployed in practice. Thus, the design and evaluation of these models ultimately requires understanding not only model accuracy but also the systems costs associated with deploying the model in an operational network. Towards this goal, this paper develops a new framework and system that enables a joint evaluation of both the conventional notions of machine learning performance (model accuracy) and the systems-level costs of different representations of network traffic. We highlight these two dimensions for two practical network management tasks, video streaming quality inference and malware detection, to demonstrate the importance of exploring different representations to find the appropriate operating point. We demonstrate the benefit of exploring a range of representations of network traffic and present Traffic Refinery, a proof-of-concept implementation that both monitors network traffic at 10~Gbps and transforms traffic in real time to produce a variety of feature representations for machine learning. Traffic Refinery both highlights this design space and makes it possible to explore different representations for learning, balancing systems costs related to feature extraction and model training against model accuracy.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114054618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dissecting Cloud Gaming Performance with DECAF","authors":"Hassan Iqbal, A. Khalid, Muhammad Shahzad","doi":"10.1145/3489048.3522628","DOIUrl":"https://doi.org/10.1145/3489048.3522628","url":null,"abstract":"Cloud gaming platforms have witnessed tremendous growth over the past two years, with a number of large Internet companies including Amazon, Facebook, Google, Microsoft, and Nvidia publicly launching their own platforms. However, there is an absence of systematic performance measurement methodologies which can generally be applied. In this paper, we implement DECAF, a methodology to systematically analyze and dissect the performance of cloud gaming platforms across different game genres and game platforms. By applying DECAF, we measure the performance of Google Stadia, Amazon Luna, and Nvidia GeForceNow, and uncover a number of important findings such as processing delays in the cloud comprise majority of the total round trip delay (≈73.54%), the video streams delivered by these platforms are characterized by high variability of bitrate, frame rate, and resolution. Our work has important implications for cloud gaming platforms and opens the door for further research on measurement methodologies for cloud gaming.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114838256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of the Resource Consumption of Distributed Deep Learning Systems","authors":"Gyeongsik Yang, C. Shin, J. Lee, Yeonho Yoo, C. Yoo","doi":"10.1145/3489048.3530962","DOIUrl":"https://doi.org/10.1145/3489048.3530962","url":null,"abstract":"Predicting resource consumption for the distributed training of deep learning models is of paramount importance, as it can inform a priori users of how long their training would take and enable users to manage the cost of training. Yet, no such prediction is available for users because the resource consumption itself varies significantly according to \"settings\" such as GPU types and also by \"workloads\" like deep learning models. Previous studies have attempted to derive or model such a prediction, but they fall short of accommodating the various combinations of settings and workloads together. This study presents Driple, which designs graph neural networks to predict the resource consumption of diverse workloads. Driple also designs transfer learning to extend the graph neural networks to adapt to differences in settings. The evaluation results show that Driple effectively predicts a wide range of workloads and settings. In addition, Driple can efficiently reduce the time required to tailor the prediction for different settings by up to 7.3×.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127892624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Convection: A GPU-Driven Case Study for Thermal-Aware Data Placement in 3D DRAMs","authors":"Soheil Khadirsharbiyani, Jagadish B. Kotra, Karthik Rao, M. Kandemir","doi":"10.1145/3489048.3522647","DOIUrl":"https://doi.org/10.1145/3489048.3522647","url":null,"abstract":"Stacked DRAMs have been studied and productized in the last decade. The large available bandwidth they offer makes them an attractive choice, particularly, in high-performance computing (HPC) environments. Consequently, many prior research efforts have studied and evaluated 3D stacked DRAM-based designs. Despite offering high bandwidth, stacked DRAMs are severely constrained by the overall memory capacity offered. In this paper, we study and evaluate integrating stacked DRAM on top of a GPU in a 3D manner which in tandem with the 2.5D stacked DRAM boosts the capacity and the bandwidth without increasing the package size. It also helps meet the capacity needs of emergent workloads like deep learning. However, the bandwidth given by these 3D stacked DRAMs is significantly constrained by the GPU's heat production. Our investigations on a cycle-level simulator show that the 3D stacked DRAM portions closest to the GPU have shorter retention times than the layers further away. Depending on the retention period, certain regions of 3D stacked DRAM are refreshed more frequently than others, leading to thermally-induced NUMA paradigms. Our proposed approach attempts to place the most frequently requested data in a thermally conscious manner, taking into consideration both bank-level parallelism and channel-level parallelism. The results collected with a cycle-level GPU simulator indicate that the three implementations of our proposed approach lead to 1.8%, 11.7%, and 14.4% performance improvements, over a baseline that already includes 3D+2.5D stacked DRAMs.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131966797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metamorphic Testing of Deep Learning Compilers","authors":"Dongwei Xiao, Zhibo Liu, Yuanyuan Yuan, Qi Pang, Shuai Wang","doi":"10.1145/3489048.3522655","DOIUrl":"https://doi.org/10.1145/3489048.3522655","url":null,"abstract":"The prosperous trend of deploying deep neural network (DNN) models to diverse hardware platforms has boosted the development of deep learning (DL) compilers. DL compilers take high-level DNN model specifications as input and generate optimized DNN executables for diverse hardware architectures like CPUs, GPUs, and hardware accelerators. We introduce MT-DLComp, a metamorphic testing framework specifically designed for DL compilers to uncover erroneous compilations. Our approach leverages deliberately-designed metamorphic relations (MRs) to launch semantics-preserving mutations toward DNN models to generate their variants. This way, DL compilers can be automatically tested for compilation correctness by comparing the execution outputs of the compiled DNN models and their variants without manual intervention. We detected over 435 inputs that can result in erroneous compilations in four popular DL compilers, all of which are industry-strength products maintained by Amazon, Facebook, Microsoft, and Google. We uncovered four bugs in these compilers by debugging them using the error-triggering inputs.","PeriodicalId":264598,"journal":{"name":"Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems","volume":"760 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132549249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}