{"title":"Clustering Streaming Graphs","authors":"A. Eldawy, R. Khandekar, Kun-Lung Wu","doi":"10.1109/ICDCS.2012.20","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.20","url":null,"abstract":"In this paper, we propose techniques for clustering large-scale \"streaming\" graphs where the updates to a graph are given in the form of a stream of vertex or edge additions and deletions. Our algorithm handles such updates in an online and incremental manner and it can be easily parallelized. Several previous graph clustering algorithms fall short of handling massive and streaming graphs because they are centralized, they need to know the entire graph beforehand and are not incremental, or they incur an excessive computational overhead. Our algorithm's fundamental building block is called graph reservoir sampling. We maintain a reservoir sample of the edges as the graph changes while satisfying certain desired properties like bounding the number of clusters or cluster sizes. We then declare connected components in the sampled subgraph as clusters of the original graph. Our experiments on real graphs show that our approach not only yields clusterings with very good quality, but also obtains orders of magnitude higher throughput, when compared to offline algorithms.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"58 1","pages":"466-475"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78863801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explaining BGP Slow Table Transfers","authors":"Pei-chun Cheng, Jong Han Park, K. Patel, S. Amante, Lixia Zhang","doi":"10.1109/ICDCS.2012.14","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.14","url":null,"abstract":"Although there have been a plethora of studies on TCP performance in support of various applications, relatively little is known about the interaction between TCP and BGP, which is a specific application running on top of TCP. This paper investigates BGP's slow route propagation by analyzing packet traces collected from a large ISP and the Route Views Oregon collector. In particular, we focus on the prolonged periods of BGP routing table transfers and examine in detail the interplay between TCP and BGP. In addition to the problems reported in previous literature, this study reveals a number of new TCP transport problems that collectively induce significant delays. Furthermore, we develop a tool, named T-DAT, that can be deployed together with BGP data collectors to infer various factors behind the observed delay, including BGP's sending and receiving behavior, TCP's parameter settings, TCP's flow and congestion control, and network path limitation. Identifying these delay-contributing factors is an important step for ISPs and router vendors to diagnose and improve BGP performance.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"387 1","pages":"657-666"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74291845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Activation Policies for Event Capture with Rechargeable Sensors","authors":"Zhu Ren, Peng Cheng, Jiming Chen, David K. Y. Yau, Youxian Sun","doi":"10.1109/ICDCS.2012.70","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.70","url":null,"abstract":"We consider the problem of event capture by a rechargeable sensor network. We assume that the events of interest follow a renewal process whose event inter-arrival times are drawn from a general probability distribution, and that a stochastic recharge process is used to provide energy for the sensors' operation. Dynamics of the event and recharge processes make the optimal sensor activation problem highly challenging. In this paper we first consider the single-sensor problem. Using dynamic control theory, we consider a full-information model in which, independent of its activation schedule, the sensor will know whether an event has occurred in the last time slot or not. In this case, the problem is framed as a Markov decision process (MDP), and we develop a simple and optimal policy for the solution. We then further consider a partial-information model where the sensor knows about the occurrence of an event only when it is active. This problem falls into the class of partially observable Markov decision processes (POMDP). Since the POMDP's optimal policy has exponential computational complexity and is intrinsically hard to solve, we propose an efficient heuristic clustering policy and evaluate its performance. Finally, our solutions are extended to handle a network setting in which multiple sensors collaborate to capture the events. We provide extensive simulation results to evaluate the performance of our solutions.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"9 1","pages":"152-162"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81261461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Name Lookup in NDN Using Effective Name Component Encoding","authors":"Yi Wang, Keqiang He, Huichen Dai, Wei Meng, Junchen Jiang, B. Liu, Yan Chen","doi":"10.1109/ICDCS.2012.35","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.35","url":null,"abstract":"Name-based route lookup is a key function for Named Data Networking (NDN). The NDN names are hierarchical and have variable and unbounded lengths, which are much longer than IPv4/6 addresses, making fast name lookup a challenging issue. In this paper, we propose an effective Name Component Encoding (NCE) solution with the following two techniques: (1) A code allocation mechanism is developed to achieve memory-efficient encoding for name components; (2) We apply improved State Transition Arrays to accelerate the longest name prefix matching and design a fast and incremental update mechanism which satisfies the special requirements of the NDN forwarding process, namely to insert, modify, and delete name prefixes frequently. Furthermore, we analyze the memory consumption and time complexity of NCE. Experimental results on a name set containing 3,000,000 names demonstrate that, compared with the character trie, NCE reduces overall memory by 30%. Besides, NCE performs a few million lookups per second (on an Intel 2.8 GHz CPU), a speedup of over 7 times compared with the character trie. Our evaluation results also show that NCE can scale up to accommodate the potential future growth of the name sets.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"19 1","pages":"688-697"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89069168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Growing Secure Distributed Systems from a Spore","authors":"Yunus Basagalar, Vassilios Lekakis, P. Keleher","doi":"10.1109/ICDCS.2012.68","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.68","url":null,"abstract":"This paper describes the design and evaluation of Spore, a secure cloud-based file system that minimizes trust and functionality assumptions on underlying servers. Spore differs from other systems in that system relationships are formalized only through signed data objects, rather than in complicated protocols executed between clients and servers. This approach allows Spore to bootstrap a file system from a single object, providing integrity and security guarantees while storing all data as simple, immutable objects on untrusted servers. We use simulation to characterize the performance of this system, focusing primarily on the cost incurred in compensating for the minimal server support. We show that while a naive approach is quite inefficient, a series of simple optimizations can enable the system to perform well in real-world scenarios.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"188 ","pages":"546-555"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91450269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADAPT: Availability-Aware MapReduce Data Placement for Non-dedicated Distributed Computing","authors":"Hui Jin, Xi Yang, Xian-He Sun, I. Raicu","doi":"10.1109/ICDCS.2012.48","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.48","url":null,"abstract":"The MapReduce programming paradigm is gaining more and more popularity recently due to its merits of ease of programming, data distribution and fault tolerance. The low barrier of adoption of MapReduce makes it a promising framework for non-dedicated distributed computing environments. However, the variability of host resources and availability could substantially degrade the performance of MapReduce applications. The replication-based fault tolerance mechanism helps to alleviate some problems at the cost of inefficient storage space utilization. Intelligent solutions that guarantee the performance of MapReduce applications with a low data replication degree are needed to promote the idea of running MapReduce applications in non-dedicated environments at lower costs. In this research, we propose an Availability-aware Data Placement (ADAPT) strategy to improve the application performance without extra storage cost. The basic idea of ADAPT is to dispatch data based on the availability of each node, reduce network traffic, improve data locality, and optimize the application performance. We implement the prototype of ADAPT within the Hadoop framework, an open-source implementation of MapReduce. The performance of ADAPT is evaluated in an emulated non-dedicated distributed environment. The experimental results show that ADAPT can improve the performance by more than 30%. ADAPT achieves high reliability without the need for additional data replication. ADAPT has also been evaluated for large-scale computing environments through simulations, with promising results.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"20 1","pages":"516-525"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81145719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PAAS: A Privacy-Preserving Attribute-Based Authentication System for eHealth Networks","authors":"Linke Guo, Chi Zhang, Jinyuan Sun, Yuguang Fang","doi":"10.1109/ICDCS.2012.45","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.45","url":null,"abstract":"Recently, eHealth systems have replaced paper-based medical systems due to their prominent features of convenience and accuracy. Also, since medical data can be stored on any kind of digital device, people can easily obtain medical services at any time and any place. However, privacy concerns over patient medical data are drawing increasing attention. In the current eHealth networks, patients are assigned multiple attributes which directly reflect their symptoms, undergoing treatments, etc. Those life-threatened attributes need to be verified by authorized medical facilities, such as hospitals and clinics. When there is a need for medical services, patients have to be authenticated by showing their identities and the corresponding attributes in order to take appropriate healthcare actions. However, directly disclosing those attributes for verification may expose real identities. Therefore, existing eHealth systems fail to preserve patients' private attribute information while maintaining the original functionalities of medical services. To solve this dilemma, we propose a framework called PAAS which leverages users' verifiable attributes to authenticate users in eHealth systems while preserving their privacy. In our system, instead of letting centralized infrastructures take care of authentication, our scheme only involves two end users. We also offer authentication strategies with progressive privacy requirements among patients or between patients and physicians. Based on the security and efficiency analysis, we show our framework is better than existing eHealth systems in terms of privacy preservation and practicality.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"62 1","pages":"224-233"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80923070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PREPARE: Predictive Performance Anomaly Prevention for Virtualized Cloud Systems","authors":"Yongmin Tan, H. Nguyen, Zhiming Shen, Xiaohui Gu, C. Venkatramani, D. Rajan","doi":"10.1109/ICDCS.2012.65","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.65","url":null,"abstract":"Virtualized cloud systems are prone to performance anomalies due to various reasons such as resource contentions, software bugs, and hardware failures. In this paper, we present a novel Predictive Performance Anomaly Prevention (PREPARE) system that provides automatic performance anomaly prevention for virtualized cloud computing infrastructures. PREPARE integrates online anomaly prediction, learning-based cause inference, and predictive prevention actuation to minimize the performance anomaly penalty without human intervention. We have implemented PREPARE on top of the Xen platform and tested it on NCSU's Virtual Computing Lab using a commercial data stream processing system (IBM System S) and an online auction benchmark (RUBiS). The experimental results show that PREPARE can effectively prevent performance anomalies while imposing low overhead on the cloud infrastructure.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"3 1","pages":"285-294"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80023896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Byte Caching in Wireless Networks","authors":"Franck Le, M. Srivatsa, A. Iyengar","doi":"10.1109/ICDCS.2012.39","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.39","url":null,"abstract":"The explosion of data consumption has led to a renewed interest in byte caching. With studies showing potential reductions in network traffic of 50%, this fine-grained caching technique looks like a very good and attractive solution for mobile wireless operators. However, properties of wireless networks actually present new challenges. We first show that a single packet loss, re-ordering or corruption -- all common conditions over the air interface -- can result in circular dependencies and cause existing byte caching algorithms to loop endlessly. To remedy the problem, we then explore a new set of encoding algorithms. Third, we assess the impact of packet losses on byte caching performance, both in terms of byte savings and delay reduction. We find that a mere 1% packet loss can already nullify any delay reduction and instead cause significant increases that users may not be willing to tolerate. Finally, we share several insights, including interactions between transport layer protocol's mechanisms (e.g., TCP window congestion) and byte caching operations that can cause sophisticated encoding algorithms to perform poorly. We believe that these insights are important for designing more efficient and robust byte caching encoding algorithms.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"55 1","pages":"265-274"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73594372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Data Replication","authors":"Sebastiano Peluso, P. Ruivo, P. Romano, F. Quaglia, L. Rodrigues","doi":"10.1109/ICDCS.2012.55","DOIUrl":"https://doi.org/10.1109/ICDCS.2012.55","url":null,"abstract":"In this article we introduce GMU, a genuine partial replication protocol for transactional systems, which exploits an innovative, highly scalable, distributed multiversioning scheme. Unlike existing multiversion-based solutions, GMU does not rely on a global logical clock, which represents a contention point and can limit system scalability. Also, GMU never aborts read-only transactions and spares them from distributed validation schemes. This makes GMU particularly efficient in the presence of read-intensive workloads, as typical of a wide range of real-world applications. GMU guarantees the Extended Update Serializability (EUS) isolation level. This consistency criterion is particularly attractive as it is sufficiently strong to ensure correctness even for very demanding applications (such as TPC-C), but is also weak enough to allow efficient and scalable implementations, such as GMU. Further, unlike several relaxed consistency models proposed in the literature, EUS has simple and intuitive semantics, thus being an attractive, scalable consistency model for ordinary programmers. We integrated the GMU protocol in a popular open source in-memory transactional data grid, namely Infinispan. On the basis of a large scale experimental study performed on heterogeneous experimental platforms and using industry standard benchmarks (namely TPC-C and YCSB), we show that GMU achieves linear scalability and that it introduces negligible overheads (less than 10%), with respect to solutions ensuring non-serializable semantics, in a wide range of workloads.","PeriodicalId":6300,"journal":{"name":"2012 IEEE 32nd International Conference on Distributed Computing Systems","volume":"91 2 1","pages":"455-465"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90168365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}