{"title":"BAR: An Efficient Data Locality Driven Task Scheduling Algorithm for Cloud Computing","authors":"Jiahui Jin, Junzhou Luo, Aibo Song, Fang Dong, Runqun Xiong","doi":"10.1109/CCGrid.2011.55","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.55","url":null,"abstract":"Large scale data processing is increasingly common in cloud computing systems like MapReduce, Hadoop, and Dryad in recent years. In these systems, files are split into many small blocks and all blocks are replicated over several servers. To process files efficiently, each job is divided into many tasks and each task is allocated to a server to deals with a file block. Because network bandwidth is a scarce resource in these systems, enhancing task data locality(placing tasks on servers that contain their input blocks) is crucial for the job completion time. Although there have been many approaches on improving data locality, most of them either are greedy and ignore global optimization, or suffer from high computation complexity. To address these problems, we propose a heuristic task scheduling algorithm called Balance-Reduce(BAR), in which an initial task allocation will be produced at first, then the job completion time can be reduced gradually by tuning the initial task allocation. By taking a global view, BAR can adjust data locality dynamically according to network state and cluster workload. The simulation results show that BAR is able to deal with large problem instances in a few seconds and outperforms previous related algorithms in term of the job completion time.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132138975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting Federated Multi-authority Security Models","authors":"J. Watt, R. Sinnott","doi":"10.1109/CCGrid.2011.77","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.77","url":null,"abstract":"The JISC-funded Shintau project has produced an extension to the Shibboleth profile which allows a user to link information from more than one IdP together utilising a custom Linking Service (LS). This paper describes both the application and independent evaluation of this software by the Nationale-Science Centre (NeSC) at the University of Glasgow within the context of the ESRC-funded Data Management throug he-Social Science (DAMES) project.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125238457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huaiming Song, Yanlong Yin, Xian-He Sun, R. Thakur, S. Lang
{"title":"A Segment-Level Adaptive Data Layout Scheme for Improved Load Balance in Parallel File Systems","authors":"Huaiming Song, Yanlong Yin, Xian-He Sun, R. Thakur, S. Lang","doi":"10.1109/CCGrid.2011.26","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.26","url":null,"abstract":"Parallel file systems are designed to mask the ever-increasing gap between CPU and disk speeds via parallel I/O processing. While they have become an indispensable component of modern high-end computing systems, their inadequate performance is a critical issue facing the HPC community today. Conventionally, a parallel file system stripes a file across multiple file servers with a fixed stripe size. The stripe size is a vital performance parameter, but the optimal value for it is often application dependent. How to determine the optimal stripe size is a difficult research problem. Based on the observation that many applications have different data-access clusters in one file, with each cluster having a distinguished data access pattern, we propose in this paper a segmented data layout scheme for parallel file systems. The basic idea behind the segmented approach is to divide a file logically into segments such that an optimal stripe size can be identified for each segment. A five-step method is introduced to conduct the segmentation, to identify the appropriate stripe size for each segment, and to carry out the segmented data layout scheme automatically. Experimental results show that the proposed layout scheme is feasible and effective, and it improves performance up to 163% for writing and 132% for reading on the widely used IOR and IOzone benchmarks.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127497304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. N. Dinh, D. Abramson, Donny Kurniawan, Chao Jin, Bob Moench, L. D. Rose
{"title":"Assertion Based Parallel Debugging","authors":"M. N. Dinh, D. Abramson, Donny Kurniawan, Chao Jin, Bob Moench, L. D. Rose","doi":"10.1109/CCGrid.2011.44","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.44","url":null,"abstract":"Programming languages have advanced tremendously over the years, but program debuggers have hardly changed. Sequential debuggers do little more than allow a user to control the flow of a program and examine its state. Parallel ones support the same operations on multiple processes, which are adequate with a small number of processors, but become unwieldy and ineffective on very large machines. Typical scientific codes have enormous multi-dimensional data structures and it is impractical to expect a user to view the data using traditional display techniques. In this paper we discuss the use of debug-time assertions, and show that these can be used to debug parallel programs. The techniques reduce the debugging complexity because they reason about the state of large arrays without requiring the user to know the expected value of every element. Assertions can be expensive to evaluate, but their performance can be improved by running them in parallel. We demonstrate the system with a case study finding errors in a parallel version of the Shallow Water Equations, and evaluate the performance of the tool on a 4,096 cores Cray XE6.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116031266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Services Throughput Optimization in a Hierarchical Middleware","authors":"E. Caron, B. Depardon, F. Desprez","doi":"10.1109/CCGrid.2011.20","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.20","url":null,"abstract":"Accessing the power of distributed resources can nowadays easily be done using a middleware based on a client/server approach. Several architectures exist for those middleware's. The most scalable ones rely on a hierarchical design. Determining the best shape for the hierarchy, the one giving the best throughput of services, is not an easy task. We first propose a computation and communication model for such hierarchical middleware. Our model takes into account the deployment of several services in the hierarchy. Then, based on this model, we propose algorithms for automatically constructing a hierarchy on two kinds of heterogeneous platforms: communication homogeneous/computation heterogeneous platforms, and fully heterogeneous platforms. The proposed algorithms aim at offering the users the best obtained to requested throughput ratio, while providing fairness on this ratio for the different kinds of services, and using as few resources as possible for the hierarchy. For each kind of platforms, we compare our model with experimental results on a real middleware called DIET (Distributed Interactive Engineering Toolbox).","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126526608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid Global Behavior Prediction","authors":"Jesús Montes, Alberto Sánchez, María S. Pérez","doi":"10.1109/CCGrid.2011.17","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.17","url":null,"abstract":"Complexity has always been one of the most important issues in distributed computing. From the first clusters to grid and now cloud computing, dealing correctly and efficiently with system complexity is the key to taking technology a step further. In this sense, global behavior modeling is an innovative methodology aimed at understanding the grid behavior. The main objective of this methodology is to synthesize the grid's vast, heterogeneous nature into a simple but powerful behavior model, represented in the form of a single, abstract entity, with a global state. Global behavior modeling has proved to be very useful in effectively managing grid complexity but, in many cases, deeper knowledge is needed. It generates a descriptive model that could be greatly improved if extended not only to explain behavior, but also to predict it. In this paper we present a prediction methodology whose objective is to define the techniques needed to create global behavior prediction models for grid systems. This global behavior prediction can benefit grid management, specially in areas such as fault tolerance or job scheduling. The paper presents experimental results obtained in real scenarios in order to validate this approach.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117317678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and Protection against Distributed Denial of Service Attacks in Accountable Grid Computing Systems","authors":"Wonjun Lee, A. Squicciarini, E. Bertino","doi":"10.1109/CCGrid.2011.28","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.28","url":null,"abstract":"By exploiting existing vulnerabilities, malicious parties can take advantage of resources made available by grid systems to attack mission-critical websites or the grid itself. In this paper, we present two approaches for protecting against attacks targeting sites outside or inside the grid. Our approach is based on special-purpose software agents that collect provenance and resource usage data in order to perform detection and protection. We show the effectiveness and the efficiency of our approach by conducting various experiments on an emulated grid test-bed.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"172 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116685549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eugenio Cesario, Antonio Grillo, C. Mastroianni, D. Talia
{"title":"A Sketch-Based Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams","authors":"Eugenio Cesario, Antonio Grillo, C. Mastroianni, D. Talia","doi":"10.1109/CCGrid.2011.45","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.45","url":null,"abstract":"This paper presents the design and the implementation of an architecture for the analysis of data streams in distributed environments. In particular, data stream analysis has been carried out for the computation of items and item sets that exceed a frequency threshold. The mining approach is hybrid, that is, frequent items are calculated with a single pass, using a sketch algorithm, while frequent item sets are calculated by a further multi-pass analysis. The architecture combines parallel and distributed processing to keep the pace with the rate of distributed data streams. In order to keep computation close to data, miners are distributed among the domains where data streams are generated. The paper also reports the experimental results obtained with a prototype of the architecture, tested on a Grid composed of two domains handling two different data streams.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"373 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126716702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"APP: Minimizing Interference Using Aggressive Pipelined Prefetching in Multi-level Buffer Caches","authors":"C. Patrick, Nicholas Voshell, M. Kandemir","doi":"10.1109/CCGrid.2011.47","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.47","url":null,"abstract":"As services become more complex with multiple interactions, and storage servers are shared by multiple services, the different I/O streams arising from these multiple services compete for disk attention. Aggressive Pipelined Prefetching (APP) enabled storage clients are designed to manage the buffer cache and I/O streams to minimize the disk I/O-interference arising from competing streams. Due to the large number of streams serviced by a storage server, most of the disk time is spent seeking, leading to degradation in response times. The goal of APP is to decrease application execution time by increasing the throughput of individual I/O streams and utilizing idle capacity on remote nodes along with idle network times thus effectively avoiding alternating bursts of activity followed by periods of inactivity. APP significantly increases overall I/O throughput and decreases overall messaging overhead between servers. In APP, the intelligence is embedded in the clients and they automatically infer parameters in order to achieve the maximum throughput. APP clients make use of aggressive prefetching and data offloading to remote buffer caches in multi-level buffer cache hierarchies in an effort to minimize disk interference and tranquilize the effects of aggressive prefetching. We used an extremely I/O-intensive Radix-k application employed in studies on the scalability of parallel image composition and particle tracing developed at the Argonne National Laboratory with data sets of up to 128GB and implemented our scheme on a 16-node Linux cluster. We observed that the execution time of the application decreased by 68% on average when using our scheme.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130221627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementing Trust in Cloud Infrastructures","authors":"R. Neisse, Dominik Holling, A. Pretschner","doi":"10.1109/CCGrid.2011.35","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.35","url":null,"abstract":"Today's cloud computing infrastructures usually require customers who transfer data into the cloud to trust the providers of the cloud infrastructure. Not every customer is willing to grant this trust without justification. It should be possible to detect that at least the configuration of the cloud infrastructure -- as provided in the form of a hyper visor and administrative domain software -- has not been changed without the customer's consent. We present a system that enables periodical and necessity-driven integrity measurements and remote attestations of vital parts of cloud computing infrastructures. Building on the analysis of several relevant attack scenarios, our system is implemented on top of the Xen Cloud Platform and makes use of trusted computing technology to provide security guarantees. We evaluate both security and performance of this system. We show how our system attests the integrity of a cloud infrastructure and detects all changes performed by system administrators in a typical software configuration, even in the presence of a simulated denial-of-service attack.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122215595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}