A. Zuck, Philipp Gühring, Zhang Tao, Donald E. Porter, Dan Tsafrir
{"title":"Why and How to Increase SSD Performance Transparency","authors":"A. Zuck, Philipp Gühring, Zhang Tao, Donald E. Porter, Dan Tsafrir","doi":"10.1145/3317550.3321430","DOIUrl":"https://doi.org/10.1145/3317550.3321430","url":null,"abstract":"Even on modern SSDs, I/O scheduling is a first-order performance concern. However, it is unclear how best to optimize I/O patterns for SSDs, because a complex layer of proprietary firmware hides many principal aspects of performance, as well as SSD lifetime. Losing this information leads to research papers drawing incorrect conclusions about prototype systems, as well as real-world systems realizing sub-optimal performance and lifetime. It is our position that a useful performance model of a foundational system component is essential, and the community should support efforts to construct models of SSD performance. We show examples from the literature and our own measurements that illustrate serious limitations of current SSD modeling tools and disk statistics. We observe an opportunity to resolve this problem by reverse engineering SSDs, leveraging recent trends toward component standardization within SSDs. This paper presents a feasibility study and initial results to reverse engineer a commercial SSD's firmware, and discusses limitations and open problems.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"17 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134311854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Calciu, Ivan Puddu, Aasheesh Kolli, A. Nowatzyk, Jayneel Gandhi, O. Mutlu, Pratap Subrahmanyam
{"title":"Project PBerry: FPGA Acceleration for Remote Memory","authors":"I. Calciu, Ivan Puddu, Aasheesh Kolli, A. Nowatzyk, Jayneel Gandhi, O. Mutlu, Pratap Subrahmanyam","doi":"10.1145/3317550.3321424","DOIUrl":"https://doi.org/10.1145/3317550.3321424","url":null,"abstract":"Recent research efforts propose remote memory systems that pool memory from multiple hosts. These systems rely on the virtual memory subsystem to track application memory accesses and transparently offer remote memory to applications. We outline several limitations of this approach, such as page fault overheads and dirty data amplification. Instead, we argue for a fundamentally different approach: leverage the local host's cache coherence traffic to track application memory accesses at cache line granularity. Our approach uses emerging cache-coherent FPGAs to expose cache coherence events to the operating system. This approach not only accelerates remote memory systems by reducing dirty data amplification and by eliminating page faults, but also enables other use cases, such as live virtual machine migration, unified virtual memory, security and code analysis. All of these use cases open up many promising research directions.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123866392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enis Ceyhun Alp, Eleftherios Kokoris-Kogias, Georgia Fragkouli, B. Ford
{"title":"Rethinking General-Purpose Decentralized Computing","authors":"Enis Ceyhun Alp, Eleftherios Kokoris-Kogias, Georgia Fragkouli, B. Ford","doi":"10.1145/3317550.3321448","DOIUrl":"https://doi.org/10.1145/3317550.3321448","url":null,"abstract":"While showing great promise, smart contracts are difficult to program correctly, as they need a deep understanding of cryptography and distributed algorithms, and offer limited functionality, as they have to be deterministic and cannot operate on secret data. In this paper we present Protean, a general-purpose decentralized computing platform that addresses these limitations by moving from a monolithic execution model, where all participating nodes store all the state and execute every computation, to a modular execution-model. Protean employs secure specialized modules, called functional units, for building decentralized applications that are currently insecure or impossible to implement with smart contracts. Each functional unit is a distributed system that provides a special-purpose functionality by exposing atomic transactions to the smart-contract developer. Combining these transactions into arbitrarily-defined workflows, developers can build a larger class of decentralized applications, such as provably-secure and fair lotteries or e-voting.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115408594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-Network Compute: Considered Armed and Dangerous","authors":"Theophilus A. Benson","doi":"10.1145/3317550.3321436","DOIUrl":"https://doi.org/10.1145/3317550.3321436","url":null,"abstract":"Programmable data planes promise unprecedented flexibility and innovation. But enormous management issues arise when these programmable data-planes, and the in-network compute functionality they enable, are deployed within production networks. In this paper, we present an overview of these management challenges, then explore the limitations of existing management techniques. Finally, we propose a system, Harmony, that encapsulates new abstractions and primitives to address these problems.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121627262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CPR for SSDs","authors":"B. Kim, Eunji Lee, Sungjin Lee, S. Min","doi":"10.1145/3317550.3321437","DOIUrl":"https://doi.org/10.1145/3317550.3321437","url":null,"abstract":"Modern storage systems are built upon the assumption that the capacity of a storage device does not change. This capacity-invariant interface forces a flash-based storage device to trade performance for reliability when, in fact, it can maintain both if a graceful reduction in capacity were to be allowed. We argue that relaxing the fixed capacity abstraction of the storage device allows for a better capacity-performance-reliability (CPR) tradeoff. We then outline existing device-internal mechanisms for building a capacity-variant flash device, and describe the necessary changes in the storage stack.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"79 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115039574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Hunt, Zhipeng Jia, Vance Miller, C. Rossbach, E. Witchel
{"title":"Isolation and Beyond: Challenges for System Security","authors":"T. Hunt, Zhipeng Jia, Vance Miller, C. Rossbach, E. Witchel","doi":"10.1145/3317550.3321427","DOIUrl":"https://doi.org/10.1145/3317550.3321427","url":null,"abstract":"System security has historically relied on hardware-provided isolation primitives. However, Meltdown [36] and Spectre [30] demonstrate that basic user/kernel isolation could be bypassed in every widely deployed ISA for decades; they are a caution to system designers who accept hardware isolation guarantees as an article of faith. Hardware isolation is fallible and should be considered fallible by software systems. We argue that future systems should broaden their view to adopt techniques that compensate for weaknesses in hardware isolation and should secure and optimize the communication among isolated components. Changing algorithms to be data oblivious, so that their externally observable behavior is independent of their (secret) input data is one such technique. Securing communication requires that the timing and size of messages be independent of secret data, but how best to achieve that independence so as to limit performance and energy overheads will vary from application to application.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125357117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hangchen Yu, A. Peters, Amogh Akshintala, C. Rossbach
{"title":"Automatic Virtualization of Accelerators","authors":"Hangchen Yu, A. Peters, Amogh Akshintala, C. Rossbach","doi":"10.1145/3317550.3321423","DOIUrl":"https://doi.org/10.1145/3317550.3321423","url":null,"abstract":"Applications are migrating en masse to the cloud, while accelerators such as GPUs, TPUs, and FPGAs proliferate in the wake of Moore's Law. These technological trends are incompatible. Cloud applications run on virtual platforms, but traditional I/O virtualization techniques have not provided production-ready solutions for accelerators. As a result, cloud providers expose accelerators by using pass-through techniques which dedicate physical devices to individual guests. The multi-tenancy that drives their business is lost as a consequence. This paper proposes automatic generation of virtual accelerator stacks to address the fundamental tradeoffs between virtualization properties and techniques for accelerators. AvA (Automatic Virtualization of Accelerators) re-purposes a para-virtual I/O stack design based on API remoting to present virtual accelerator APIs to guest VMs. Conventional wisdom is that API remoting sacrifices interposition and compatibility. AvA forwards invocations over hypervisor-managed transport to recover interposition. AvA compensates for lost compatibility by automatically generating guest libraries, drivers, hypervisor-level schedulers, and API servers. AvA supports pluggable transport layers, allowing VMs to use disaggregated accelerators. With AvA, a single developer could virtualize a core subset of OpenCL at near-native performance in just a few days.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124313211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When Should The Network Be The Computer?","authors":"Dan R. K. Ports, J. Nelson","doi":"10.1145/3317550.3321439","DOIUrl":"https://doi.org/10.1145/3317550.3321439","url":null,"abstract":"Researchers have repurposed programmable network devices to place small amounts of application computation in the network, sometimes yielding orders-of-magnitude performance gains. At the same time, effectively using these devices requires careful use of limited resources and managing deployment challenges. This paper provides a framework for principled use of in-network processing. We provide a set of guidelines for building robust and deployable in-network primives, along with a taxonomy to help identify which applications can benefit from in-network processing and what types of devices they should use.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115641892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nines are Not Enough: Meaningful Metrics for Clouds","authors":"J. Mogul, J. Wilkes","doi":"10.1145/3317550.3321432","DOIUrl":"https://doi.org/10.1145/3317550.3321432","url":null,"abstract":"Cloud customers want strong, understandable promises (Service Level Objectives, or SLOs) that their applications will run reliably and with adequate performance, but cloud providers don't want to offer them, because they are technically hard to meet in the face of arbitrary customer behavior and the hidden interactions brought about by statistical multiplexing of shared resources. Existing cloud SLOs are more concerned with defending against corner cases than defining normal behavior. This and other tensions make SLOs surprisingly hard to define. We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We argue that a mutually beneficial set of Service Level Expectations (SLEs) and Customer Behavior Expectations (CBEs) ameliorates many of the problems of today's SLOs by explicitly sharing risk between customer and service provider.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124935303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}