{"title":"Learning from the Past: Intelligent On-Line Weather Monitoring Based on Matrix Completion","authors":"Kun Xie, Lele Wang, Xin Wang, Jigang Wen, Gaogang Xie","doi":"10.1109/ICDCS.2014.26","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.26","url":null,"abstract":"Matrix completion has emerged very recently and provides a new venue for low cost data gathering in WSNs. Existing schemes often assume that the data matrix has a known and fixed low-rank, which is unlikely to hold in a practical monitoring system such as weather data gathering. Weather data varies in temporal and spatial domain with time. By analyzing a large set of weather data collected from 196 sensors in ZhuZhou, China, we reveal that weather data have the features of low-rank, temporal stability, and relative rank stability. Taking advantage of these features, we propose an on-line data gathering scheme based on matrix completion theory, named MC-Weather, to adaptively sample different locations according to environmental and weather conditions. To better schedule sampling process while satisfying the required reconstruction accuracy, we propose several novel techniques, including three sample learning principles, an adaptive sampling algorithm based on matrix completion, and a uniform time slot and cross sample model. With these techniques, our MC-Weather scheme can collect the sensory data at required accuracy while largely reduce the cost for sensing, communication and computation. We perform extensive simulations based on the real weather data sets and the simulation results validate the efficiency and efficacy of the proposed scheme.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114503421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Will They Blend?: Exploring Big Data Computation Atop Traditional HPC NAS Storage","authors":"E. Wilson, M. Kandemir, Garth A. Gibson","doi":"10.1109/ICDCS.2014.60","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.60","url":null,"abstract":"The Apache Hadoop framework has rung in a new era in how data-rich organizations can process, store, and analyze large amounts of data. This has resulted in increased potential for an infrastructure exodus from the traditional solution of commercial database ad-hoc analytics on network-attached storage (NAS). While many data-rich organizations can afford to either move entirely to Hadoop for their Big Data analytics, or to maintain their existing traditional infrastructures and acquire a new set of infrastructure solely for Hadoop jobs, most supercomputing centers do not enjoy either of those possibilities. Too much of the existing scientific code is tailored to work on massively parallel file systems unlike the Hadoop Distributed File System (HDFS), and their datasets are too large to reasonably maintain and/or ferry between two distinct storage systems. Nevertheless, as scientists search for easier-to-program frameworks with a lower time-to-science to post-process their huge datasets after execution, there is increasing pressure to enable use of MapReduce within these traditional High Performance Computing (HPC) architectures. Therefore, in this work we explore potential means to enable use of the easy-to-program Hadoop MapReduce framework without requiring a complete infrastructure overhaul from existing HPC NAS solutions. We demonstrate that retaining function-dedicated resources like NAS is not only possible, but can even be effected efficiently with MapReduce. In our exploration, we unearth subtle pitfalls resultant from this mash-up of new-era Big Data computation on conventional HPC storage and share the clever architectural configurations that allow us to avoid them. Last, we design and present a novel Hadoop File System, the Reliable Array of Independent NAS File System (RainFS), and experimentally demonstrate its improvements in performance and reliability over the previous architectures we have investigated.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131928839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Energy Cost for Strongly Stable Multi-hop Green Cellular Networks","authors":"Weixian Liao, Ming Li, Sergio Salinas, Pan Li, M. Pan","doi":"10.1109/ICDCS.2014.15","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.15","url":null,"abstract":"With the ever increasing user adoption of mobile devices like smart phones and tablets, the cellular service providers' energy consumption and cost are fast-growing and have received tremendous attention. How to effectively reduce the energy cost of cellular networks and achieve green communications while satisfying cellular users' rocketing traffic demands has become an urgent and challenging problem. In this paper, we investigate the minimization of the long-term time-averaged expected energy cost of a cellular service provider while guaranteeing the strong stability of the network. We first formulate an offline optimization problem with a joint consideration of flow routing, link scheduling, and energy (i.e., renewable energy resource, energy storage unit, etc.) constraints. Since the formulated problem is a time-coupling stochastic Mixed-Integer Non-Linear Programming (MINLP) problem, it is prohibitively expensive to solve. Then, we reformulate the problem by employing Lyapunov optimization theory. A decomposition based algorithm is developed to solve the problem, which is proved to guarantee the network strong stability. Both the lower and upper bounds on the optimal result of the original problem are derived and proven. Simulation results demonstrate that the obtained lower and upper bounds are very tight, and that the proposed scheme results in noticeable energy cost savings.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128907175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OpenSample: A Low-Latency, Sampling-Based Measurement Platform for Commodity SDN","authors":"Junho Suh, T. Kwon, C. Dixon, Wes Felter, J. Carter","doi":"10.1109/ICDCS.2014.31","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.31","url":null,"abstract":"In this paper we propose, implement and evaluate OpenSample: a low-latency, sampling-based network measurement platform targeted at building faster control loops for software-defined networks. OpenSample leverages sFlow packet sampling to provide near-real-time measurements of both network load and individual flows. While OpenSample is useful in any context, it is particularly useful in an SDN environment where a network controller can quickly take action based on the data it provides. Using sampling for network monitoring allows OpenSample to have a 100 millisecond control loop rather than the 1-5 second control loop of prior polling-based approaches. We implement OpenSample in the Floodlight Open Flow controller and evaluate it both in simulation and on a test bed comprised of commodity switches. When used to inform traffic engineering, OpenSample provides up to a 150% throughput improvement over both static equal-cost multi-path routing and a polling-based solution with a one second control loop.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130824159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling Privacy-Preserving Image-Centric Social Discovery","authors":"Xingliang Yuan, Xinyu Wang, Cong Wang, A. Squicciarini, K. Ren","doi":"10.1109/ICDCS.2014.28","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.28","url":null,"abstract":"The increasing popularity of images at social media sites is posing new opportunities for social discovery applications, i.e., suggesting new friends and discovering new social groups with similar interests via exploring images. To effectively handle the explosive growth of images involved in social discovery, one common trend for many emerging social media sites is to leverage the commercial public cloud as their robust backend data center. While extremely convenient, directly exposing content-rich images and the related social discovery results to the public cloud also raises new acute privacy concerns. In light of the observation, in this paper we propose a privacy-preserving social discovery service architecture based on encrypted images. As the core of such social discovery is to compare and quantify similar images, we first adopt the effective Bag-of-Words model to extract the \"visual similarity content\" of users' images into image profile vectors, and then model the problem as similarity retrieval of encrypted high-dimensional image profiles. To support fast and scalable similarity search over hundreds of thousands of encrypted images, we propose a secure and efficient indexing structure. The resulting design enables social media sites to obtain secure, practical, and accurate social discovery from the public cloud, without disclosing the encrypted image content. We formally prove the security and discuss further extensions on user image update and the compatibility with existing image sharing social functionalities. Extensive experiments on a large Flickr image dataset demonstrate the practical performance of the proposed design. Our qualitative social discovery results show consistency with human perception.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"1 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113976329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Use of Diverse Replicas for Big Location Tracking Data","authors":"Ye Ding, Haoyu Tan, Wuman Luo, L. Ni","doi":"10.1109/ICDCS.2014.17","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.17","url":null,"abstract":"The value of large amount of location tracking data has received wide attention in many applications including human behavior analysis, urban transportation planning, and various location-based services (LBS). Nowadays, both scientific and industrial communities are encouraged to collect as much location tracking data as possible, which brings about two issues: 1) it is challenging to process the queries on big location tracking data efficiently, and 2) it is expensive to store several exact data replicas for fault-tolerance. So far, several dedicated storage systems have been proposed to address these issues. However, they do not work well when the query ranges vary widely. In this paper, we present the design of a storage system using diverse replica scheme which improves the query processing efficiency with reduced cost of storage space. To the best of our knowledge, we are the first to investigate the data storage and processing in the context of big location tracking data. Specifically, we conduct in-depth theoretical and empirical analysis of the trade-offs between different spatio-temporal partitioning schemes as well as data encoding schemes. Then we propose an effective approach to select an appropriate set of diverse replicas, which is optimized for the expected query loads while conforming to the given storage space budget. The experiment results confirm that using diverse replicas can significantly improve the overall query performance. The results also demonstrate that the proposed algorithms for the replica selection problem is both effective and efficient.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131158697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact Analysis of Topology Poisoning Attacks on Economic Operation of the Smart Power Grid","authors":"M. Rahman, E. Al-Shaer, R. Kavasseri","doi":"10.1109/ICDCS.2014.72","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.72","url":null,"abstract":"The Optimal Power Flow (OPF) routine used in energy control centers allocates individual generator outputs by minimizing the overall cost of generation subject to system level operating constraints. The OPF relies on the outputs of two other modules, namely topology processor and state estimator. The topology processor maps the grid topology based on statuses received from the switches and circuit breakers across the system. The state estimator computes the system state, i.e., voltage magnitudes with phase angles, transmission line flows, and system loads based on real-time meter measurements. However, topology statuses and meter measurements are vulnerable to false data injection attacks. Recent research has shown that such cyber attacks can be launched against state estimation where adversaries can corrupt the states but still remain undetected. In this paper, we show how the stealthy topology poisoning attacks can compromise the integrity of OPF, and thus undermine economic operation. We describe a formal verification based framework to systematically analyze the impact of such attacks on OPF. The proposed framework is illustrated with an example. We also evaluate the scalability of the framework with respect to time and memory requirements.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"18 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133980325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compiler Driven Automatic Kernel Context Migration for Heterogeneous Computing","authors":"Ramy Gad, Tim Süß, A. Brinkmann","doi":"10.1109/ICDCS.2014.47","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.47","url":null,"abstract":"Computer systems provide different heterogeneous resources (e.g., GPUs, DSPs and FPGAs) that accelerate applications and that can reduce the energy consumption by using them. Usually, these resources have an isolated memory and a require target specific code to be written. There exist tools that can automatically generate target specific codes for program parts, so-called kernels. The data objects required for a target kernel execution need to be moved to the target resource memory. It is the programmers' responsibility to serialize these data objects used in the kernel and to copy them to or from the resource's memory. Typically, the programmer writes his own serializing function or uses existing serialization libraries. Unfortunately, both approaches require code modifications, and the programmer needs knowledge of the used data structure format. There is a need for a tool that is able to automatically extract the original kernel data objects, serialize them, and migrate them to a target resource without requiring intervention from the programmer. In this paper, we present a tool collection ConSerner that automatically identifies, gathers, and serializes the context of a kernel and migrates it to a target resource's memory where a target specific kernel is executed with this data. This is all done transparently to the programmer. Complex data structures can be used without making a modification of the program code by a programmer necessary. Predefined data structures in external libraries (e.g., the STL's vector) can also be used as long as the source code of these libraries is available.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131757808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Columbus: Configuration Discovery for Clouds","authors":"R. Balani, Deepak Jeswani, Dipyaman Banerjee, Akshat Verma","doi":"10.1109/ICDCS.2014.41","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.41","url":null,"abstract":"Low-cost, accurate and scalable software configuration discovery is the key to simplifying many cloud management tasks. However, the lack of standardization across software configuration techniques has prevented the development of a fully automated and application independent configuration discovery solution. In this work, we present Columbus, an application-agnostic system to automatically discover environmental configuration parameters or Points of Variability (PoV) in clustered applications with high accuracy. Columbus uses the insight that even though configuration mechanisms and files vary across different software, the PoVs are encoded using a few common patterns. It uses a novel rule framework to annotate file content with PoVs and a Bayesian network to estimate confidence for annotated PoVs. Our experiments confirm that Columbus can accurately discover configuration for a diverse set of enterprise and cloud applications. It has subsequently been integrated in three real-world systems that analyze this information for discovery of distributed application dependencies, enterprise IT migration and virtual application configuration.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114771803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enforcing Location and Time-Based Access Control on Cloud-Stored Data","authors":"Elli Androulaki, Claudio Soriente, Luka Malisa, Srdjan Capkun","doi":"10.1109/ICDCS.2014.71","DOIUrl":"https://doi.org/10.1109/ICDCS.2014.71","url":null,"abstract":"Recent incidents of data-breaches from the cloud suggest that users should not trust the cloud provider to enforce access control on their data. We focus on mitigating trust to the cloud in scenarios where granting access to data not only considers user identities (as in conventional access policies), but also contextual information such as the user's location and time of access. Previous work in this context assumes a fully trusted cloud that is further capable of locating users. We introduce LoTAC, a novel framework that seamlessly integrates the operation of a cloud provider and a localization infrastructure to enforce location- and time-based access control to cloud-stored data. In LoTAC, the two entities operate independently and are only trusted to offer their basic services: the cloud provider is used and trusted only to reliably store data, the localization infrastructure is used and trusted only to accurately locate users. Furthermore, neither the cloud provider nor the localization infrastructure can access the data, even if they collude. LoTAC protocols require no changes to the cloud provider and minimal changes to the localization infrastructure. We evaluate our protocols using a cellular network as the localization infrastructure and show that they incur in low communication and computation costs and scale well with a large number of users and policies.","PeriodicalId":170186,"journal":{"name":"2014 IEEE 34th International Conference on Distributed Computing Systems","volume":"1993 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128629318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}