{"title":"Century-scale smart infrastructure","authors":"Dhananjay Jagtap, N. Bhaskar, P. Pannuto","doi":"10.1145/3458336.3465275","DOIUrl":"https://doi.org/10.1145/3458336.3465275","url":null,"abstract":"On average, wireless electronics devices are replaced every 50 months. On average, a bridge is replaced every 50 years. As we begin to imagine integrating electronics and intelligence into the built environment, we need to begin to think about electronic devices and systems on infrastructure timelines. This is not to say that every individual electronic device can, will, or should last for decades, but much like the ship of Theseus, the system that defines emerging Smart Cities will have a lifetime reaching into the century-scale. In this paper, we contemplate what the devices, gateways, network architectures, and their management might look like for a system designed to operate for decades. The result is a mixture of actionable insights for today and research questions for tomorrow, which culminates in the commencement of a 50-year experiment designed to see how long energy-harvesting sensors, without the implicit lifetime of batteries, can remain viable without human attention or intervention.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133301270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systems research is running out of time","authors":"A. Najafi, Amy Tai, Michael Wei","doi":"10.1145/3458336.3465293","DOIUrl":"https://doi.org/10.1145/3458336.3465293","url":null,"abstract":"Most sciences conduct experiments with a thorough understanding of the accuracy and precision of the instruments used for making measurements. Time is the most frequently used measurement in systems research, yet most of the literature does not consider the precision and accuracy of clocks. In this paper, we argue for the importance of understanding timekeeping and providing precise and accurate time for general systems research.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114883689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The future of the shell: Unix and beyond","authors":"M. Greenberg, Konstantinos Kallas, N. Vasilakis","doi":"10.1145/3458336.3465296","DOIUrl":"https://doi.org/10.1145/3458336.3465296","url":null,"abstract":"The Unix shell is fifty years old, and it continues to be the primary way to configure, deploy, and manage systems of all kinds. What do the next fifty years hold? What is the command-line interface of the 21st century? This 90-minute panel brings together researchers and engineers from disparate communities (systems, languages, security) to think about the shell's strengths and weaknesses, challenges and opportunities around the shell, and the shell's future.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126833589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy heroes need data disguises","authors":"Lillian Tsai, Malte Schwarzkopf, E. Kohler","doi":"10.1145/3458336.3465284","DOIUrl":"https://doi.org/10.1145/3458336.3465284","url":null,"abstract":"Providing privacy in complex, data-rich applications is hard. Deleting accounts, anonymizing an account's contributions, and other privacy-related actions may require the traversal and transformation of interwoven state in a relational database. Finding the affected data is already nontrivial, but privacy actions must additionally balance competing requirements, such as preserving data trails for legal reasons or allowing users to change their mind. We believe a systematic shared framework for specifying and implementing privacy transformations could simplify and empower applications. Our prototype, data disguising, supports fine-grained, nuanced, and useful policies that would be cumbersome to implement manually, including reversible transformations that can compose.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114334137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A case against (most) context switches","authors":"J. Humphries, Kostis Kaffes, David Mazières, C. Kozyrakis","doi":"10.1145/3458336.3465274","DOIUrl":"https://doi.org/10.1145/3458336.3465274","url":null,"abstract":"Multiplexing software threads onto hardware threads and serving interrupts, VM-exits, and system calls require frequent context switches, causing high overheads and significant kernel and application complexity. We argue that context switching is an idea whose time has come and gone, and propose eliminating it through a radically different hardware threading model targeted to solve software rather than hardware problems. The new model adds a large number of hardware threads to each physical core - making thread multiplexing unnecessary - and lets software manage them. The only state change directly triggered in hardware by system calls, exceptions, and asynchronous hardware events will be blocking and unblocking hardware threads. We also present ISA extensions to allow kernel and user software to exploit this new threading model. Developers can use these extensions to eliminate interrupts and implement fast I/O without polling, exception-less system and hypervisor calls, practical microkernels, simple distributed programming models, and untrusted but fast hypervisors. Finally, we suggest practical hardware implementations and discuss the hardware and software challenges toward realizing this novel approach.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125525773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CloudEx","authors":"Ahmad Ghalayini, Jinkun Geng, Vighnesh Sachidananda, Vinay Sriram, Yilong Geng, B. Prabhakar, M. Rosenblum, Anirudh Sivaraman","doi":"10.1145/3458336.3465278","DOIUrl":"https://doi.org/10.1145/3458336.3465278","url":null,"abstract":"Financial exchanges have begun a move from on-premise and custom-engineered datacenters to the public cloud, accelerated by a rush of new investors, the rise of remote work, cost savings from the cloud, and the desire for more resilient infrastructure. While the promise of the cloud is enticing, the cloud's varying network latencies can lead to market unfairness: orders can be processed out of sequence, and market data can be disseminated to market participants at incorrect times due to varying latencies between participants and the exchange. We present CloudEx, a fair-access cloud exchange, which leverages high-precision software clock synchronization to compensate for noisy network conditions in the public cloud. We also discuss refinements to the CloudEx design that were informed by lessons learned from deploying CloudEx in two academic courses and conclude by outlining future research directions.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116610239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In reference to RPC: it's time to add distributed memory","authors":"Stephanie Wang, Benjamin Hindman, I. Stoica","doi":"10.1145/3458336.3465302","DOIUrl":"https://doi.org/10.1145/3458336.3465302","url":null,"abstract":"RPC has been remarkably successful. Most distributed applications built today use an RPC runtime such as gRPC [3] or Apache Thrift [2]. The key behind RPC's success is the simple but powerful semantics of its programming model. In particular, RPC has no shared state: arguments and return values are passed by value between processes, meaning that they must be copied into the request or reply. Thus, arguments and return values are inherently immutable. These simple semantics facilitate highly efficient and reliable implementations, as no distributed coordination is required, while remaining useful for a general set of distributed applications. The generality of RPC also enables interoperability: any application that speaks RPC can communicate with another application that understands RPC.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116621465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FlexOS","authors":"Hugo Lefeuvre, Vlad-Andrei Bădoiu, Stefan Teodorescu, Pierre-Louis Olivier, Tiberiu Mosnoi, Răzvan Deaconescu, Felipe Huici, C. Raiciu","doi":"10.1145/3458336.3465292","DOIUrl":"https://doi.org/10.1145/3458336.3465292","url":null,"abstract":"OS design is traditionally heavily intertwined with protection mechanisms. OSes statically commit to one or a combination of (1) hardware isolation, (2) runtime checking, and (3) software verification early at design time. Changes after deployment require major refactoring; as such, they are rare and costly. In this paper, we argue that this strategy is at odds with recent hardware and software trends: protections break (Meltdown), hardware becomes heterogeneous (Memory Protection Keys, CHERI), and multiple mechanisms can now be used for the same task (software hardening, verification, HW isolation, etc). In short, the choice of isolation strategy and primitives should be postponed to deployment time. We present FlexOS, a novel, modular OS design whose compartmentalization and protection profile can seamlessly be tailored towards a specific application or use-case at build time. FlexOS offers a language to describe components' security needs/behavior, and to automatically derive from it a compartmentalization strategy. We implement an early prototype of FlexOS that can automatically generate a large array of different OSes implementing different security strategies.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115512914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fail-slow fault tolerance needs programming support","authors":"Andrew Yoo, Yuanli Wang, Ritesh Sinha, Shuai Mu, Tianyin Xu","doi":"10.1145/3458336.3465299","DOIUrl":"https://doi.org/10.1145/3458336.3465299","url":null,"abstract":"The need for fail-slow fault tolerance in modern distributed systems is highlighted by the increasingly reported fail-slow hardware/software components that lead to poor performance system-wide. We argue that fail-slow fault tolerance not only needs new distributed protocol designs, but also desires programming support for implementing and verifying fail-slow fault-tolerant code. Our observation is that the inability of tolerating fail-slow faults in existing distributed systems is often rooted in the implementations and is difficult to understand and debug. We designed the Dependably Fast Library (DepFast) for implementing fail-slow tolerant distributed systems. DepFast provides expressive interfaces for taking control of possible fail-slow points in the program to prevent unexpected slowness propagation once and for all. We use DepFast to implement a distributed replicated state machine (RSM) and show that it can tolerate various types of fail-slow faults that affect existing RSM implementations.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"142 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115788894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Doing more with less: training large DNN models on commodity servers for the masses","authors":"Youjie Li, Amar Phanishayee, D. Murray, N. Kim","doi":"10.1145/3458336.3465289","DOIUrl":"https://doi.org/10.1145/3458336.3465289","url":null,"abstract":"Deep neural networks (DNNs) have grown exponentially in complexity and size over the past decade, leaving only the elite who have access to massive datacenter-based resources with the ability to develop and train such models. One of the main challenges for the long tail of researchers who might have access to only limited resources (e.g., a single multi-GPU server) is limited GPU memory capacity compared to model size. The problem is so acute that the memory requirement of training large DNN models can often exceed the aggregate capacity of all available GPUs on commodity servers; this problem only gets worse with the trend of ever-growing model sizes. Current solutions that rely on virtualizing GPU memory (by swapping to/from CPU memory) incur excessive swapping overhead. In this paper, we advocate rethinking how DNN frameworks schedule computation and move data to push the boundaries of training large models efficiently on modest multi-GPU deployments.","PeriodicalId":224944,"journal":{"name":"Proceedings of the Workshop on Hot Topics in Operating Systems","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124422706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}