R. Stets, S. Dwarkadas, N. Hardavellas, G. Hunt, L. Kontothanassis, S. Parthasarathy, M. Scott
{"title":"Cashmere-2L: software coherent shared memory on a clustered remote-write network","authors":"R. Stets, S. Dwarkadas, N. Hardavellas, G. Hunt, L. Kontothanassis, S. Parthasarathy, M. Scott","doi":"10.1145/268998.266675","DOIUrl":"https://doi.org/10.1145/268998.266675","url":null,"abstract":"Low-latency remote-write networks, such as DEC's Memory Channel, provide the possibility of transparent, inexpensive, large-scale shared-memory parallel computing on clusters of shared memory multiprocessors (SMPs). The challenge is to take advantage of hardware shared memory for sharing within an SMP, and to ensure that software overhead is incurred only when actively sharing data across SMPs in the cluster. In this paper, we describe a two-level software coherent shared memory system-Cashmere-2L-that meets this challenge. Cashmere-2L uses hardware to share memory within a node, while exploiting the Memory Channel's remote-write capabilities to implement moderately lazy release consistency with multiple concurrent writers, directories, home nodes, and page-size coherence blocks across nodes. Cashmere-2L employs a novel coherence protocol that allows a high level of asynchrony by eliminating global directory locks and the need for TLB shootdown. Remote interrupts are minimized by exploiting the remote-write capabilities of the Memory Channel network. Cashmere-2L currently runs on an 8-node, 32-processor DEC AlphaServer system. Speedups range from 8 to 31 on 32 processors for our benchmark suite, depending on the application's characteristics. We quantify the importance of our protocol optimizations by comparing performance to that of several alternative protocols that do not share memory in hardware within an SMP, and require more synchronization. In comparison to a one-level protocol that does not share memory in hardware within an SMP, Cashmere-2L improves performance by up to 46%.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129916228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CPU reservations and time constraints: efficient, predictable scheduling of independent activities","authors":"Michael B. Jones, D. Rosu, Marcel-Catalin Rosu","doi":"10.1145/268998.266689","DOIUrl":"https://doi.org/10.1145/268998.266689","url":null,"abstract":"Workstations and personal computers are increasingly being used for applications with real-time characteristics such as speech understanding and synthesis, media computations and I/O, and animation, often concurrently executed with traditional non-real-time workloads. This paper presents a system that can schedule multiple independent activities so that: . activities can obtain minimum guaranteed execution rates with application-specified reservation granularities via CPU Reservations, CPU Reservations, which are of the form reserve X units of time out of every Y units, provide not just an average case execution rate of X/Y over long periods of time, but the stronger guarantee that from any instant of time, by Y time units later, the activity will have executed for at least X time units, . applications can use Time Constraints to schedule tasks by deadlines, with on-time completion guaranteed for tasks with accepted constraints, and . both CPU Reservations and Time Constraints are implemented very efficiently. In particular, . CPU scheduling overhead is bounded by a constant and is not a function of the number of schedulable tasks. Other key scheduler properties are: . activities cannot violate other activities' guarantees, . time constraints and CPU reservations may be used together, separately, or not at all (which gives a round-robin schedule), with well-defined interactions between all combinations, and . spare CPU time is fairly shared among all activities. The Rialto operating system, developed at Microsoft Research, achieves these goals by using a precomputed schedule, which is the fundamental basis of this work.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133636330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Free transactions with Rio Vista","authors":"David E. Lowell, Peter M. Chen","doi":"10.1145/268998.266665","DOIUrl":"https://doi.org/10.1145/268998.266665","url":null,"abstract":"Transactions and recoverable memories are powerful mechanisms for handling failures and manipulating persistent data. Unfortunately, standard recoverable memories incur an overhead of several milliseconds per transaction. This paper presents a system that improves transaction overhead by a factor of 2000 for working sets that fit in main memory. Of this factor of 2000, a factor of 20 is due to the Rio file cache, which absorbs synchronous writes to disk without losing data during system crashes. The remaining factor of 100 is due to Vista, a 720-line, recoverable-memory library tailored for Rio. Vista lowers transaction overhead to 5 μsec by using no redo log, no system calls, and only one memory-to-memory copy. This drastic reduction in overhead leads to a overall speedup of 150-556x for benchmarks based on TPC-B and TPC-C.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115615607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Fox, S. Gribble, Y. Chawathe, E. Brewer, P. Gauthier
{"title":"Cluster-based scalable network services","authors":"A. Fox, S. Gribble, Y. Chawathe, E. Brewer, P. Gauthier","doi":"10.1145/268998.266662","DOIUrl":"https://doi.org/10.1145/268998.266662","url":null,"abstract":"We identify three fundamental requirements for scalable network services: incremental scalability and overflow growth provisioning, 24x7 availability through fault masking, and cost-effectiveness. We argue that clusters of commodity workstations interconnected by a high-speed SAN are exceptionally well-suited to meeting these challenges for Internet-server workloads, provided the software infrastructure for managing partial failures and administering a large cluster does not have to be reinvented for each new service. To this end, we propose a general, layered architecture for building cluster-based scalable network services that encapsulates the above requirements for reuse, and a service-programming model based on composable workers that perform transformation, aggregation, caching, and customization (TACC) of Internet content. For both performance and implementation simplicity, the architecture and TACC programming model exploit BASE, a weaker-than-ACID data semantics that results from trading consistency for availability and relying on soft state for robustness in failure management. Our architecture can be used as an off the shelf infrastructural platform for creating new network services, allowing authors to focus on the content of the service (by composing TACC building blocks) rather than its implementation. We discuss two real implementations of services based on this architecture: TranSend, a Web distillation proxy deployed to the UC Berkeley dialup IP population, and HotBot, the commercial implementation of the Inktomi search engine. We present detailed measurements of TranSend's performance based on substantial client traces, as well as anecdotal evidence from the TranSend and HotBot experience, to support the claims made for the architecture.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121336634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
K. Petersen, M. Spreitzer, D. Terry, M. Theimer, A. Demers
{"title":"Flexible update propagation for weakly consistent replication","authors":"K. Petersen, M. Spreitzer, D. Terry, M. Theimer, A. Demers","doi":"10.1145/268998.266711","DOIUrl":"https://doi.org/10.1145/268998.266711","url":null,"abstract":"Bayou's anti-entropy protocol for update propagation between weakly consistent storage replicas is based on pair-wise communication, the propagation of write operations, and a set of ordering and closure constraints on the propagation of the writes. The simplicity of the design makes the protocol very flexible, thereby providing support for diverse networking environments and usage scenarios. It accommodates a variety of policies for when and where to propagate updates. It operates over diverse network topologies, including low-bandwidth links. It is incremental. It enables replica convergence, and updates can be propagated using floppy disks and similar transportable media. Moreover, the protocol handles replica creation and retirement in a light-weight manner. Each of these features is enabled by only one or two of the protocol's design choices, and can be independently incorporated in other systems. This paper presents the anti-entropy protocol in detail, describing the design decisions and resulting features.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131885090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Kaashoek, K. Mackenzie, D. Engler, G. Ganger, Héctor M. Briceño, R. Hunt, David Mazières, T. Pinckney, R. Grimm, John Jannotti
{"title":"Application performance and flexibility on exokernel systems","authors":"M. Kaashoek, K. Mackenzie, D. Engler, G. Ganger, Héctor M. Briceño, R. Hunt, David Mazières, T. Pinckney, R. Grimm, John Jannotti","doi":"10.1145/268998.266644","DOIUrl":"https://doi.org/10.1145/268998.266644","url":null,"abstract":"The exokernel operating system architecture safely gives untrusted software efficient control over hardware and software resou rces by separating management from protection. This paper describes an exokernel system that allows specialized applications to achieve high performance without sacrificing the performance of unm odified UNIX programs. It evaluates the exokernel architectur e by measuring end-to-end application performance on Xok, an exokernel for Intel x86-based computers, and by comparing Xok’s performance to the performance of two widely-used 4.4BSD UNIX systems (FreeBSD and OpenBSD). The results show that common unmodified UNIX applications can enjoy the benefits of exoker nels: applications either perform comparably on Xok/ExOS and the BSD UNIXes, or perform significantly better. In addition , the results show that customized applications can benefit subst antially from control over their resources (e.g., a factor of eight fo r a Web server). This paper also describes insights about the exokernel approach gained through building three different exokernel systems, and presents novel approaches to resource multiplexing.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"157 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124314874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brian D. Noble, M. Satyanarayanan, D. Narayanan, James Eric Tilton, J. Flinn, Kevin R. Walker
{"title":"Agile application-aware adaptation for mobility","authors":"Brian D. Noble, M. Satyanarayanan, D. Narayanan, James Eric Tilton, J. Flinn, Kevin R. Walker","doi":"10.1145/268998.266708","DOIUrl":"https://doi.org/10.1145/268998.266708","url":null,"abstract":"In this paper we show that application-aware adaptation, a collaborative partnership between the operating system and applications, offers the most general and effective approach to mobile information access. We describe the design of Odyssey, a prototype implementing this approach, and show how it supports concurrent execution of diverse mobile applications. We identify agility as a key attribute of adaptive systems, and describe how to quantify and measure it. We present the results of our evaluation of Odyssey, indicating performance improvements up to a factor of 5 on a benchmark of three applications concurrently using remote services over a network with highly variable bandwidth.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122722205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting the non-determinism and asynchrony of set iterators to reduce aggregate file I/O latency","authors":"D. Steere","doi":"10.1145/268998.266705","DOIUrl":"https://doi.org/10.1145/268998.266705","url":null,"abstract":"A key goal of distributed systems is to provide prompt access to shared information repositories. The high latency ofremote access is a serious impediment to this goal. This paper describes a new file system abstraction called dynamic sets - unordered collections created by an application to hold the files it intends to process. Applications that iterate on the set to access its members allow the system to reduce the aggregate I/O latency by exploiting the non-determinism and asychrony inherent in the semantics of set iterators. This reduction in latency comes without relying on reference locality, without modifying DFS servers and protocols, and without unduly complicating the programming model. This paper presents this abstraction and describes an implementation of it that runs on local and distributed file systems, as well as the World Wide Web. Dynamic sets demonstrate substantial performance gains - up to 50% savings in runtime for search on NFS, and up to 90% reduction in 1/O latency for Web searches.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117206484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated hoarding for mobile computers","authors":"G. Kuenning, G. Popek","doi":"10.1145/268998.266706","DOIUrl":"https://doi.org/10.1145/268998.266706","url":null,"abstract":"A common problem facing mobile computing is disconnected operation, or computing in the absence of a network. Hoarding eases disconnected operation by selecting a subset of the user’s files for local storage. We describe a hoarding system that can operate without user intervention, by observing user activity and predicting future needs. The system calculates a new measure, semantic distance, between individual files, and uses this to feed a clustering algorithm that chooses which files should be hoarded. A separate replication system manages the actual transport of data; any of a number of replication systems may be used. We discuss practical problems encountered in the real world and present usage statistics showing that our system outperforms previous approaches by factors that can exceed 10:1.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131161249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jeanna Neefe Matthews, D. Roselli, Adam M. Costello, Randolph Y. Wang, T. Anderson
{"title":"Improving the performance of log-structured file systems with adaptive methods","authors":"Jeanna Neefe Matthews, D. Roselli, Adam M. Costello, Randolph Y. Wang, T. Anderson","doi":"10.1145/268998.266700","DOIUrl":"https://doi.org/10.1145/268998.266700","url":null,"abstract":"File system designers today face a dilemma. A log-structured file system (LFS) can offer superior performance for many common workloads such as those with frequent small writes, read traffic that is predominantly absorbed by the cache, and sufficient idle time to clean the log. However, an LFS has poor performance for other workloads, such as random updates to a full disk with little idle time to clean. In this paper, we show how adaptive algorithms can be used to enable LFS to provide high performance across a wider range of workloads. First, we show how to improve LFS write performance in three ways: by choosing the segment size to match disk and workload characteristics, by modifying the LFS cleaning policy to adapt to changes in disk utilization, and by using cached data to lower cleaning costs. Second, we show how to improve LFS read performance by reorganizing data to match read patterns. Using trace-driven simulations on a combination of synthetic and measured workloads, we demonstrate that these extensions to LFS can significantly improve its performance.","PeriodicalId":340271,"journal":{"name":"Proceedings of the sixteenth ACM symposium on Operating systems principles","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133686632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}