{"title":"Unix shell programming: the next 50 years","authors":"M. Greenberg, Konstantinos Kallas, N. Vasilakis","doi":"10.1145/3458336.3465294","DOIUrl":"https://doi.org/10.1145/3458336.3465294","url":null,"abstract":"The Unix shell is a powerful, ubiquitous, and reviled tool for managing computer systems. The shell has been largely ignored by academia and industry. While many replacement shells have been proposed, the Unix shell persists. Two recent threads of formal and practical research on the shell enable new approaches. We can help manage the shell's essential shortcomings (dynamism, power, and abstruseness) and address its inessential ones. Improving the shell holds much promise for development, ops, and data processing.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123983182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cores that don't count","authors":"P. Hochschild, Paul Turner, J. Mogul, R. Govindaraju, Parthasarathy Ranganathan, D. Culler, A. Vahdat","doi":"10.1145/3458336.3465297","DOIUrl":"https://doi.org/10.1145/3458336.3465297","url":null,"abstract":"We are accustomed to thinking of computers as fail-stop, especially the cores that execute instructions, and most system software implicitly relies on that assumption. During most of the VLSI era, processors that passed manufacturing tests and were operated within specifications have insulated us from this fiction. As fabrication pushes towards smaller feature sizes and more elaborate computational structures, and as increasingly specialized instruction-silicon pairings are introduced to improve performance, we have observed ephemeral computational errors that were not detected during manufacturing tests. These defects cannot always be mitigated by techniques such as microcode updates, and may be correlated to specific components within the processor, allowing small code changes to effect large shifts in reliability. Worse, these failures are often \"silent\" - the only symptom is an erroneous computation. We refer to a core that develops such behavior as \"mercurial.\" Mercurial cores are extremely rare, but in a large fleet of servers we can observe the disruption they cause, often enough to see them as a distinct problem - one that will require collaboration between hardware designers, processor vendors, and systems software architects. This paper is a call-to-action for a new focus in systems research; we speculate about several software-based approaches to mercurial cores, ranging from better detection and isolating mechanisms, to methods for tolerating the silent data corruption they cause.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122472255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward reconfigurable kernel datapaths with learned optimizations","authors":"Yiming Qiu, Hongyi Liu, T. Anderson, Yingyan Lin, Ang Chen","doi":"10.1145/3458336.3465288","DOIUrl":"https://doi.org/10.1145/3458336.3465288","url":null,"abstract":"Today's computing systems pay a heavy \"OS tax\", as kernel execution accounts for a significant amount of resource footprint. This is not least because today's kernels abound with hardcoded heuristics that are designed with unstated assumptions, which rarely generalize well for diversifying applications and device technologies. We propose the concept of reconfigurable kernel datapaths that enables kernels to self-optimize dynamically. In this architecture, optimizations are computed from empirical data using machine learning (ML), and they are integrated into the kernel in a safe and systematic manner via an in-kernel virtual machine. This virtual machine implements the reconfigurable match table (RMT) abstraction, where tables are installed into the kernel at points where performance-critical events occur, matches look up the current execution context, and actions encode context-specific optimizations computed by ML, which may further vary from application to application. Our envisioned architecture will support both offline and online learning algorithms, as well as varied kernel subsystems. An RMT verifier will check program well-formedness and model efficiency before admitting an RMT program to the kernel. An admitted program can be interpreted in bytecode or just-in-time compiled to optimize the kernel datapaths.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125697395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BPF for storage: an exokernel-inspired approach","authors":"Y. Wu, Hongyi Wang, Yuhong Zhong, Asaf Cidon, Ryan Stutsman, Amy Tai, Junfeng Yang","doi":"10.1145/3458336.3465290","DOIUrl":"https://doi.org/10.1145/3458336.3465290","url":null,"abstract":"The overhead of the kernel storage path accounts for half of the access latency for new NVMe storage devices. We explore using BPF to reduce this overhead, by injecting user-defined functions deep in the kernel's I/O processing stack. When issuing a series of dependent I/O requests, this approach can increase IOPS by over 2.5X and cut latency by half, by bypassing kernel layers and avoiding user-kernel boundary crossings. However, we must avoid losing important properties when bypassing the file system and block layer, such as the safety guarantees of the file system and translation between physical block addresses and file offsets. We sketch potential solutions to these problems, inspired by exokernel file systems from the late 90s, whose time, we believe, has finally come! \"As a dog returns to his vomit, so a fool repeats his folly.\" Attributed to King Solomon","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115355050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Granular Computing","authors":"Collin Lee, J. Ousterhout","doi":"10.1145/3317550.3321447","DOIUrl":"https://doi.org/10.1145/3317550.3321447","url":null,"abstract":"Granular computing is a new style of computing where applications are composed of large numbers (thousands to millions) of very short-lived (10-100μs) tasks. Today's systems and infrastructure were designed to support millisecond-scale operations and are inadequate to meet the demands of granular computing. In this position paper we discuss the challenges of supporting granular applications, such as handling extreme bursts of activity, and we present a few initial ideas about the infrastructure required to enable granular computing, such as new mechanisms for communication and persistence.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116953550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Synchronous Data Center","authors":"Tian Yang, Robert Gifford, Andreas Haeberlen, L. T. Phan","doi":"10.1145/3317550.3321442","DOIUrl":"https://doi.org/10.1145/3317550.3321442","url":null,"abstract":"Today, distributed systems are typically designed to be largely asynchronous. Designers assume that the network can drop or significantly delay messages at unpredictable times, that there is no way to know how quickly a node might process a message, or how soon it might respond, and that the clocks of different nodes are at most loosely synchronized. These assumptions are certainly safe, but they come at a price: many applications really do need predictable performance, which, on top of an asynchronous system, has to be approximated at great cost and with lots of redundancy, and many distributed protocols for asynchronous systems are much more complex and expensive than their synchronous counterparts. The goal of this paper is to start a discussion about whether the asynchronous model is (still) the right choice for most distributed systems, especially ones that belong to a single administrative domain. We argue that 1) by using ideas from the CPS domain, it is technically feasible to build datacenter-scale systems that are fully synchronous, and that 2) such systems would have several interesting advantages over current designs.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125515041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RedLeaf","authors":"Vikram Narayanan, Marek S. Baranowski, L. Ryzhyk, Zvonimir Rakamaric, A. Burtsev","doi":"10.1145/3317550.3321449","DOIUrl":"https://doi.org/10.1145/3317550.3321449","url":null,"abstract":"RedLeaf is a new operating system being developed from scratch to utilize formal verification for implementing provably secure firmware. RedLeaf is developed in a safe language, Rust, and relies on automated reasoning using satisfiability modulo theories (SMT) solvers for formal verification. RedLeaf builds on two premises: (1) Rust's linear type system enables practical language safety even for systems with the tightest performance and resource budgets (e.g., firmware), and (2) a combination of SMT-based reasoning and pointer discipline enforced by linear types provides a unique way to automate and simplify verification effort, scaling it to the size of a small OS kernel.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117047597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fork() in the road","authors":"Andrew Baumann, J. Appavoo, O. Krieger, Timothy Roscoe","doi":"10.1145/3317550.3321435","DOIUrl":"https://doi.org/10.1145/3317550.3321435","url":null,"abstract":"The received wisdom suggests that Unix's unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability. We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives. As the designers and implementers of operating systems, we should acknowledge that fork's continued existence as a first-class OS primitive holds back systems research, and deprecate it. As educators, we should teach fork as a historical artifact, and not the first process creation mechanism students encounter.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131713898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What bugs cause production cloud incidents?","authors":"Haopeng Liu, Shan Lu, M. Musuvathi, Suman Nath","doi":"10.1145/3317550.3321438","DOIUrl":"https://doi.org/10.1145/3317550.3321438","url":null,"abstract":"Cloud services have become the backbone of today's computing world. Runtime incidents, which adversely affect the expected service operations, are extremely costly in terms of user impacts and engineering efforts required to resolve them. Hence, such incidents are the target of much research effort. Unfortunately, there is limited understanding about cloud service incidents that actually happen during production runs: what causes them and how they are resolved. In this work, we carefully study hundreds of high-severity incidents that occurred recently during the production runs of many Microsoft Azure services. We find software bugs to be a major cause behind these incidents, and make interesting observations about the types of software bugs that cause cloud incidents and how these bug-related incidents are resolved, providing motivation and guidance to future research in tackling cloud bugs and improving the cloud-service availability.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115676525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"I'm Not Dead Yet!: The Role of the Operating System in a Kernel-Bypass Era","authors":"Irene Zhang, Jing Liu, A. Austin, Michael L. Roberts, Anirudh Badam","doi":"10.1145/3317550.3321422","DOIUrl":"https://doi.org/10.1145/3317550.3321422","url":null,"abstract":"Researchers have long predicted the demise of the operating system [21, 26, 41]. As datacenter servers increasingly incorporate I/O devices that let applications bypass the OS kernel (e.g., RDMA [12] and DPDK [15] network devices or SPDK storage devices), this prediction may finally come true. While kernel-bypass devices do eliminate the OS kernel from the I/O path, they do not handle the kernel's most important job: offering higher-level abstractions. This paper argues for a new high-level, device-agnostic I/O abstraction for kernel-bypass devices. We propose the Demikernel, a new library OS architecture for kernel-bypass devices. It defines a high-level, kernel-bypass I/O abstraction and provides user-space library OSes to implement that abstraction across a range of kernel-bypass devices. The Demikernel makes applications easier to build, portable across devices, and unmodified as devices continue to evolve.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134137550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}