Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22)

Title: OS scheduling with nest: keeping tasks close together on warm cores
Authors: J. Lawall, Himadri Chhaya-Shailesh, Jean-Pierre Lozi, Baptiste Lepers, W. Zwaenepoel, Gilles Muller
DOI: https://doi.org/10.1145/3492321.3519585
Abstract: To best support highly parallel applications, Linux's CFS scheduler tends to spread tasks across the machine on task creation and wakeup. It has been observed, however, that in a server environment, such a strategy leads to tasks being unnecessarily placed on long-idle cores that are running at lower frequencies, reducing performance, and to tasks being unnecessarily distributed across sockets, consuming more energy. In this paper, we propose to exploit the principle of core reuse by constructing a nest of cores to be used preferentially for task scheduling, thus obtaining higher frequencies and using fewer sockets. We implement the Nest scheduler in the Linux kernel. While performance and energy usage are comparable to CFS for highly parallel applications, for a range of applications using fewer tasks than cores, Nest improves performance by 10%--2× and can reduce energy usage.

Title: Unicorn: reasoning about configurable system performance through the lens of causality
Authors: Md Shahriar Iqbal, R. Krishna, Mohammad Ali Javidian, Baishakhi Ray, Pooyan Jamshidi
DOI: https://doi.org/10.1145/3492321.3519575
Abstract: Modern computer systems are highly configurable, with a total variability space sometimes larger than the number of atoms in the universe. Understanding and reasoning about the performance behavior of highly configurable systems over such a vast and variable space is challenging. State-of-the-art methods for performance modeling and analysis rely on predictive machine-learning models and therefore (i) become unreliable in unseen environments (e.g., different hardware or workloads) and (ii) may produce incorrect explanations. To tackle this, we propose a new method, called Unicorn, which (i) captures intricate interactions between configuration options across the software-hardware stack and (ii) describes how such interactions can impact performance variations via causal inference. We evaluated Unicorn on six highly configurable systems: three on-device machine learning systems, a video encoder, a database management system, and a data analytics pipeline. The experimental results indicate that Unicorn outperforms state-of-the-art performance debugging and optimization methods in finding effective repairs for performance faults and in finding configurations with near-optimal performance. Further, unlike the existing methods, the learned causal performance models reliably predict performance in new environments.

Title: Varuna: scalable, low-cost training of massive deep learning models
Authors: Sanjith Athlur, Nitika Saran, Muthian Sivathanu, R. Ramjee, Nipun Kwatra
DOI: https://doi.org/10.1145/3492321.3519584
Abstract: Systems for training massive deep learning models (billions of parameters) today assume and require specialized "hyperclusters": hundreds or thousands of GPUs wired with specialized high-bandwidth interconnects such as NVLink and InfiniBand. Besides being expensive, such dependence on hyperclusters and custom high-speed interconnects limits the size of such clusters, creating (a) scalability limits on job parallelism and (b) resource fragmentation across hyperclusters. In this paper, we present Varuna, a new system that enables training massive deep learning models on commodity networking. Varuna makes thrifty use of networking resources and automatically configures the user's training job to efficiently use any given set of resources. Varuna is therefore able to leverage "low-priority" VMs that cost about 5× less than dedicated GPUs, significantly reducing the cost of training massive models. We demonstrate the efficacy of Varuna by training massive models, including a 200-billion-parameter model, on 5× cheaper "spot VMs", while maintaining high training throughput. Varuna improves end-to-end training time for language models like BERT and GPT-2 by up to 18× compared to other model-parallel approaches and by up to 26% compared to other pipeline-parallel approaches on commodity VMs. The code for Varuna is available at https://github.com/microsoft/varuna.

Title: Nyx-Net: network fuzzing with incremental snapshots
Authors: Sergej Schumilo, Cornelius Aschermann, Andrea Jemmett, A. Abbasi, Thorsten Holz
DOI: https://doi.org/10.1145/3492321.3519591
Abstract: Coverage-guided fuzz testing ("fuzzing") has become mainstream, and this research area has seen much progress recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range of targets spanning servers, clients, games, and even Firefox's Inter-Process Communication (IPC) interface. Compared to state-of-the-art methods, Nyx-Net improves test throughput by up to 300× and coverage found by up to 70%. Additionally, Nyx-Net is able to find crashes in two of ProFuzzBench's targets that no other fuzzer found previously. When used to play the game Super Mario, Nyx-Net shows speedups of 10--30× compared to existing work. Moreover, Nyx-Net is able to find previously unknown bugs in servers such as Lighttpd, clients such as the MySQL client, and even Firefox's IPC mechanism, demonstrating the strength and versatility of the proposed approach. Lastly, our prototype implementation was awarded a $20,000 bug bounty for enabling fuzzing on previously unfuzzable code in Firefox and solving a long-standing problem at Mozilla.
{"title":"Minimum viable device drivers for ARM trustzone","authors":"Liwei Guo, F. Lin","doi":"10.1145/3492321.3519565","DOIUrl":"https://doi.org/10.1145/3492321.3519565","url":null,"abstract":"While TrustZone can isolate IO hardware, it lacks drivers for modern IO devices. Rather than porting drivers, we propose a novel approach to deriving minimum viable drivers: developers exercise a full driver and record the driver/device interactions; the processed recordings, dubbed driverlets, are replayed in the TEE at run time to access IO devices. Driverlets address two key challenges: correctness and expressiveness, for which they build on a key construct called interaction template. The interaction template ensures faithful reproduction of recorded IO jobs (albeit on new IO data); it accepts dynamic input values; it tolerates nondeterministic device behaviors. We demonstrate driverlets on a series of sophisticated devices, making them accessible to Trust-Zone for the first time to our knowledge. Our experiments show that driverlets are secure, easy to build, and incur acceptable overhead (1.4×-2.7× compared to native drivers). Driverlets fill a critical gap in the TrustZone TEE, realizing its long-promised vision of secure IO.","PeriodicalId":196414,"journal":{"name":"Proceedings of the Seventeenth European Conference on Computer Systems","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121227747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Multi-objective congestion control
Authors: Yiqing Ma, Han Tian, Xudong Liao, Junxue Zhang, Weiyan Wang, Kai Chen, Xin Jin
DOI: https://doi.org/10.1145/3492321.3519593
Abstract: Decades of research on Internet congestion control (CC) have produced a plethora of algorithms that optimize for different performance objectives. Applications face the challenge of choosing the most suitable algorithm for their needs, and it takes tremendous effort and expertise to customize CC algorithms when new demands emerge. In this paper, we explore a basic question: can we design a single CC algorithm that satisfies different objectives? We propose MOCC, the first multi-objective congestion control algorithm that attempts to address this question. The core of MOCC is a novel multi-objective reinforcement learning framework for CC that automatically learns the correlations between different application requirements and the corresponding optimal control policies. Under this framework, MOCC further applies transfer learning to carry knowledge from past experience over to new applications, quickly adapting itself to a new objective even if it is unforeseen. We provide both user-space and kernel-space implementations of MOCC. Real-world Internet experiments and extensive simulations show that MOCC supports multiple objectives well, competing with or outperforming the best existing CC algorithms on each individual objective, and adapts to new application objectives in 288 seconds (14.2× faster than prior work) without compromising old ones.

Title: Narwhal and Tusk: a DAG-based mempool and efficient BFT consensus
Authors: G. Danezis, Eleftherios Kokoris-Kogias, A. Sonnino, A. Spiegelman
DOI: https://doi.org/10.1145/3492321.3519594
Abstract: We propose separating the task of reliable transaction dissemination from transaction ordering to enable high-performance Byzantine fault-tolerant quorum-based consensus. We design and evaluate a mempool protocol, Narwhal, specializing in high-throughput reliable dissemination and storage of causal histories of transactions. Narwhal tolerates an asynchronous network and maintains high performance despite failures. Narwhal is designed to scale out easily using multiple workers at each validator, and we demonstrate that there is no foreseeable limit to the throughput we can achieve. Composing Narwhal with a partially synchronous consensus protocol (Narwhal-HotStuff) yields significantly better throughput even in the presence of faults or intermittent loss of liveness due to asynchrony. However, loss of liveness can result in higher latency. To achieve good overall performance when faults occur, we design Tusk, a zero-message-overhead asynchronous consensus protocol, to work with Narwhal. We demonstrate its high performance under a variety of configurations and faults. In summary, on a WAN, Narwhal-HotStuff achieves over 130,000 tx/sec at less than 2 sec latency, compared with 1,800 tx/sec at 1 sec latency for HotStuff. Additional workers increase throughput linearly to 600,000 tx/sec without any latency increase. Tusk achieves 160,000 tx/sec at about 3 seconds of latency. Under faults, both protocols maintain high throughput, but Narwhal-HotStuff suffers from increased latency.

Title: Isolating functions at the hardware limit with virtines
Authors: Nicholas C. Wanninger, Josh Bowden, K. Shetty, Ayush Garg, Kyle C. Hale
DOI: https://doi.org/10.1145/3492321.3519553
Abstract: An important class of applications, including programs that leverage third-party libraries, programs that use user-defined functions in databases, and serverless applications, benefits from isolating the execution of untrusted code at the granularity of individual functions or function invocations. However, existing isolation mechanisms were not designed for this use case; rather, they have been adapted to it. We introduce virtines, a new abstraction designed specifically for function-granularity isolation, and describe how we build virtines from the ground up by pushing hardware virtualization to its limits. Virtines give developers fine-grained control in deciding which functions should run in isolated environments and which should not. The virtine abstraction is a general one, and we demonstrate a prototype that adds extensions to the C language. We present a detailed analysis of the overheads of running individual functions in isolated VMs, and guided by those findings, we present Wasp, an embeddable hypervisor that allows programmers to easily use virtines. We describe several representative scenarios that employ individual function isolation and demonstrate that virtines can be applied in these scenarios with only a few lines of changes to existing codebases and with acceptable slowdowns.