{"title":"New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice","authors":"Cristian Estan, G. Varghese","doi":"10.1145/859716.859719","DOIUrl":"https://doi.org/10.1145/859716.859719","url":null,"abstract":"Accurate network traffic measurement is required for accounting, bandwidth provisioning and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which count periodically sampled packets, are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of \"heavy hitters\" accounts for a large share of traffic. Our paper introduces a paradigm shift by concentrating the measurement process on large flows only---those above some threshold such as 0.1% of the link capacity. We propose two novel and scalable algorithms for identifying the large flows: sample and hold and multistage filters, which take a constant number of memory references per packet and use a small amount of memory. If M is the available memory, we show analytically that the errors of our new algorithms are proportional to 1/M; by contrast, the error of an algorithm based on classical sampling is proportional to 1/√M, thus providing much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration-based pricing.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"29 1","pages":"270-313"},"PeriodicalIF":1.5,"publicationDate":"2003-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79767074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining","authors":"R. V. Renesse, K. Birman, W. Vogels","doi":"10.1145/762483.762485","DOIUrl":"https://doi.org/10.1145/762483.762485","url":null,"abstract":"Scalable management and self-organizational capabilities are emerging as central requirements for a generation of large-scale, highly dynamic, distributed applications. We have developed an entirely new distributed information management system called Astrolabe. Astrolabe collects large-scale system state, permitting rapid updates and providing on-the-fly attribute aggregation. This latter capability permits an application to locate a resource, and also offers a scalable way to track system state as it evolves over time. The combination of features makes it possible to solve a wide variety of management and self-configuration problems. This paper describes the design of the system with a focus upon its scalability. After describing the Astrolabe service, we present examples of the use of Astrolabe for locating resources, publish-subscribe, and distributed synchronization in large systems. Astrolabe is implemented using a peer-to-peer protocol, and uses a restricted form of mobile code based on the SQL query language for aggregation. This protocol gives rise to a novel consistency model. Astrolabe addresses several security considerations using a built-in PKI. The scalability of the system is evaluated using both simulation and experiments; these confirm that Astrolabe could scale to thousands and perhaps millions of nodes, with information propagation delays in the tens of seconds.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"93 1","pages":"164-206"},"PeriodicalIF":1.5,"publicationDate":"2003-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75660655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Size-based scheduling to improve web performance","authors":"Mor Harchol-Balter, Bianca Schroeder, N. Bansal, Mukesh Agrawal","doi":"10.1145/762483.762486","DOIUrl":"https://doi.org/10.1145/762483.762486","url":null,"abstract":"Is it possible to reduce the expected response time of every request at a web server, simply by changing the order in which we schedule the requests? That is the question we ask in this paper. This paper proposes a method for improving the performance of web servers servicing static HTTP requests. The idea is to give preference to requests for small files or requests with short remaining file size, in accordance with the SRPT (Shortest Remaining Processing Time) scheduling policy. The implementation is at the kernel level and involves controlling the order in which socket buffers are drained into the network. Experiments are executed both in a LAN and a WAN environment. We use the Linux operating system and the Apache and Flash web servers. Results indicate that SRPT-based scheduling of connections yields significant reductions in delay at the web server. These result in a substantial reduction in mean response time and mean slowdown for both the LAN and WAN environments. Significantly, and counter to intuition, the requests for large files are only negligibly penalized or not at all penalized as a result of SRPT-based scheduling.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"14 1","pages":"207-233"},"PeriodicalIF":1.5,"publicationDate":"2003-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88463246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Run-time adaptation in river","authors":"Remzi H. Arpaci-Dusseau","doi":"10.1145/592637.592639","DOIUrl":"https://doi.org/10.1145/592637.592639","url":null,"abstract":"We present the design, implementation, and evaluation of run-time adaptation within the River dataflow programming environment. The goal of the River system is to provide adaptive mechanisms that allow database query-processing applications to cope with performance variations that are common in cluster platforms. We describe the system and its basic mechanisms, and carefully evaluate those mechanisms and their effectiveness. In our analysis, we answer four previously unanswered and important questions. Are the core run-time adaptive mechanisms effective, especially as compared to the ideal? What are the keys to making them work well? Can applications easily use these primitives? And finally, are there situations in which run-time adaptation is not sufficient? In performing our study, we utilize a three-pronged approach, comparing results from idealized models of system behavior, targeted simulations, and a prototype implementation. As well as providing insight on the positives and negatives of run-time adaptation both specifically in River and in a broader context, we also comment on the interplay of modeling, simulation, and implementation in system design.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"38 1","pages":"36-86"},"PeriodicalIF":1.5,"publicationDate":"2003-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85695893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Run-time support for distributed sharing in safe languages","authors":"Y. C. Hu, Weimin Yu, A. Cox, D. Wallach, W. Zwaenepoel","doi":"10.1145/592637.592638","DOIUrl":"https://doi.org/10.1145/592637.592638","url":null,"abstract":"We present a new run-time system that supports object sharing in a distributed system. The key insight in this system is that a handle-based implementation of such a system enables efficient and transparent sharing of data with both fine- and coarse-grained access patterns. In addition, it supports efficient execution of garbage-collected programs. In contrast, conventional distributed shared memory (DSM) systems are limited to providing only one granularity with good performance, and have experienced difficulty in efficiently supporting garbage collection. A safe language, in which no pointer arithmetic is allowed, can transparently be compiled into a handle-based system and constitutes its preferred mode of use. A programmer can also directly use a handle-based programming model that avoids pointer arithmetic on the handles, and achieve the same performance but without the programming benefits of a safe programming language. This new run-time system, DOSA (Distributed Object Sharing Architecture), provides a shared object space abstraction rather than a shared address space abstraction. The key to its efficiency is the observation that a handle-based distributed implementation permits VM-based access and modification detection without suffering false sharing for fine-grained access patterns. We compare DOSA to TreadMarks, a conventional DSM system that is efficient at handling coarse-grained sharing. The performance of fine-grained applications and garbage-collected applications is considerably better than in TreadMarks, and the performance of coarse-grained applications is nearly as good as in TreadMarks. Inasmuch as the performance of such applications is already good in TreadMarks, we consider this an acceptable performance penalty.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"18 1","pages":"1-35"},"PeriodicalIF":1.5,"publicationDate":"2003-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84193600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure program partitioning","authors":"S. Zdancewic, Lantian Zheng, Nathaniel Nystrom, A. Myers","doi":"10.1145/566340.566343","DOIUrl":"https://doi.org/10.1145/566340.566343","url":null,"abstract":"This paper presents secure program partitioning, a language-based technique for protecting confidential data during computation in distributed systems containing mutually untrusted hosts. Confidentiality and integrity policies can be expressed by annotating programs with security types that constrain information flow; these programs can then be partitioned automatically to run securely on heterogeneously trusted hosts. The resulting communicating subprograms collectively implement the original program, yet the system as a whole satisfies the security requirements of participating principals without requiring a universally trusted host machine. The experience in applying this methodology and the performance of the resulting distributed code suggest that this is a promising way to obtain secure distributed computation.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"1 1","pages":"283-328"},"PeriodicalIF":1.5,"publicationDate":"2002-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83709451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and evaluation of a conit-based continuous consistency model for replicated services","authors":"Haifeng Yu, Amin Vahdat","doi":"10.1145/566340.566342","DOIUrl":"https://doi.org/10.1145/566340.566342","url":null,"abstract":"The tradeoffs between consistency, performance, and availability are well understood. Traditionally, however, designers of replicated systems have been forced to choose from either strong consistency guarantees or none at all. This paper explores the semantic space between traditional strong and optimistic consistency models for replicated services. We argue that an important class of applications can tolerate relaxed consistency, but benefit from bounding the maximum rate of inconsistent access in an application-specific manner. Thus, we develop a conit-based continuous consistency model to capture the consistency spectrum using three application-independent metrics: numerical error, order error, and staleness. We then present the design and implementation of TACT, a middleware layer that enforces arbitrary consistency bounds among replicas using these metrics. We argue that the TACT consistency model can simultaneously achieve the often conflicting goals of generality and practicality by describing how a broad range of applications can express their consistency semantics using TACT and by demonstrating that application-independent algorithms can efficiently enforce target consistency levels. Finally, we show that three replicated applications running across the Internet demonstrate significant semantic and performance benefits from using our framework.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"11 1","pages":"239-282"},"PeriodicalIF":1.5,"publicationDate":"2002-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88606646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Moshe: A group membership service for WANs","authors":"I. Keidar, Jeremy B. Sussman, K. Marzullo, D. Dolev","doi":"10.1145/566340.566341","DOIUrl":"https://doi.org/10.1145/566340.566341","url":null,"abstract":"We present Moshe, a novel scalable group membership algorithm built specifically for use in wide area networks (WANs), which can suffer partitions. Moshe is designed with three new significant features that are important in this setting: it avoids delivering views that reflect out-of-date memberships; it requires a single round of messages in the common case; and it employs a client-server design for scalability. Furthermore, Moshe's interface supplies the hooks needed to provide clients with full virtual synchrony semantics. We have implemented Moshe on top of a network event mechanism also designed specifically for use in a WAN. In addition to specifying the properties of the algorithm and proving that this specification is met, we provide empirical results of an implementation of Moshe running over the Internet. The empirical results justify the assumptions made by our design and exhibit good performance. In particular, Moshe terminates within a single communication round over 98% of the time. The experimental results also lead to interesting observations regarding the performance of membership algorithms over the Internet.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"82 1","pages":"191-238"},"PeriodicalIF":1.5,"publicationDate":"2002-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75951829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring system normality","authors":"M. Burgess, H. Haugerud, Sigmund Straumsnes, T. Reitan","doi":"10.1145/507052.507054","DOIUrl":"https://doi.org/10.1145/507052.507054","url":null,"abstract":"A comparative analysis of transaction time-series is made, for light to moderately loaded hosts, motivated by the problem of anomaly detection in computers. Criteria for measuring the statistical state of hosts are examined. Applying a scaling transformation to the measured data, it is found that the distribution of fluctuations about the mean is closely approximated by a steady-state, maximum-entropy distribution, modulated by a periodic variation. The shape of the distribution, under these conditions, depends on the dimensionless ratio of the daily/weekly periodicity and the correlation length of the data. These values are persistent or even invariant. We investigate the limits of these conclusions, and how they might be applied in anomaly detection.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"96 1","pages":"125-160"},"PeriodicalIF":1.5,"publicationDate":"2002-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84216180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Let caches decay: reducing leakage energy via exploitation of cache generational behavior","authors":"Zhigang Hu, S. Kaxiras, M. Martonosi","doi":"10.1145/507052.507055","DOIUrl":"https://doi.org/10.1145/507052.507055","url":null,"abstract":"Power dissipation is increasingly important in CPUs ranging from those intended for mobile use, all the way up to high-performance processors for high-end servers. Although the bulk of the power dissipated is dynamic switching power, leakage power is also beginning to be a concern. Chipmakers expect that in future chip generations, leakage's proportion of total chip power will increase significantly. This article examines methods for reducing leakage power within the cache memories of the CPU. Because caches comprise much of a CPU chip's area and transistor counts, they are reasonable targets for attacking leakage. We discuss policies and implementations for reducing cache leakage by invalidating and \"turning off\" cache lines when they hold data not likely to be reused. In particular, our approach is targeted at the generational nature of cache line usage. That is, cache lines typically have a flurry of frequent use when first brought into the cache, and then have a period of \"dead time\" before they are evicted. By devising effective, low-power ways of deducing dead time, our results show that in many cases we can reduce L1 cache leakage energy by 4x in SPEC2000 applications without having an impact on performance. Because our decay-based techniques have notions of competitive online algorithms at their roots, their energy usage can be theoretically bounded to within a factor of two of the optimal oracle-based policy. We also examine adaptive decay-based policies that make energy-minimizing policy choices on a per-application basis by choosing appropriate decay intervals individually for each cache line. Our proposed adaptive policies effectively reduce L1 cache leakage energy by 5x for the SPEC2000 with only negligible degradations in performance.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"21 1","pages":"161-190"},"PeriodicalIF":1.5,"publicationDate":"2002-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78834103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}