{"title":"A Highly Scalable Decentralized Scheduler of Tasks with Deadlines","authors":"Javier Celaya, U. Arronategui","doi":"10.1109/Grid.2011.17","DOIUrl":"https://doi.org/10.1109/Grid.2011.17","url":null,"abstract":"Scheduling of tasks in distributed environments, like cloud and grid computing platforms, using deadlines to provide quality of service is a challenging problem. The few existing proposals suffer from scalability limitations, because they try to manage full knowledge of the system state. To our knowledge, there is no implementation yet that reaches scales of a hundred thousand nodes. In this paper, we present a fully decentralized scheduler, that aggregates information about the availability of the execution nodes throughout the network and uses it to allocate tasks to those nodes that are able to finish them in time. Through simulation, we show that our scheduler is able to operate on different scenarios, from many-task applications in cloud computing sites to volunteer computing projects. Simulations on networks of up to a hundred thousand nodes show very competitive performance, reaching allocation times of under a second and very low overhead in low latency gigabit networks.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127686459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Strategy to Improve Resource Utilization in Grids Based on Network-Aware Meta-scheduling in Advance","authors":"Luis Tomás, A. Caminero, María Blanca Caminero, C. Carrión","doi":"10.1109/Grid.2011.16","DOIUrl":"https://doi.org/10.1109/Grid.2011.16","url":null,"abstract":"The provision of Quality of Service (QoS) in Grids (systems made of heterogeneous computing resources geographically dispersed) is still a challenging task that needs the attention of the research community. Since reservations of resources may not always be possible, another possible way of enhancing the QoS perceived by Grid users is by performing meta-scheduling of jobs in advance, where jobs are scheduled some time before they are actually executed. Hence, it becomes more likely that the appropriate resources are available to run the job whenever needed. One of the drawbacks of this scenario is that fragmentation appears as a well known effect in job allocations into resources. Fragmentation also becomes the cause for poor resource utilization. For these reasons, a new technique has been developed to tackle fragmentation problems, which consists of rescheduling already scheduled tasks. To this end, some heuristics have been implemented to figure out which intervals need replanning and to select the jobs which are involved in that rescheduling process. On top of that, another heuristic has been implemented to put rescheduled jobs as close together as possible so that fragmentation is avoided or reduced to the minimum. This technique has been tested using a real test bed involving heterogeneous computing resources from different organizations. An evaluation is presented that illustrates the efficiency of this approach to meet the users' QoS requirements.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126494900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adjustable Module Isolation for Distributed Computing Infrastructures","authors":"S. Schulz, W. Blochinger","doi":"10.1109/Grid.2011.22","DOIUrl":"https://doi.org/10.1109/Grid.2011.22","url":null,"abstract":"Cloud Computing infrastructures and Grid Computing platforms are representatives of a new breed of systems that leverage the modularity paradigm to assemble large-scale dynamic applications from modules contributed by different, possibly untrustworthy providers. Increased susceptibility to faults, diminished accountability, and complex system configuration are major challenges when assembling and operating such systems. In this paper, we describe how to solve these problems by retrofitting module management systems with the ability to deploy modules to execution environments with adjustable degree of isolation. We give a formal definition of the underlying hierarchical Module Isolation Problem and devise an online algorithm to solve it in an incremental fashion. We discuss how to apply our approach to a state-of-the-art module management system and demonstrate its effectiveness by an experimental evaluation.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132743299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Migration of Multi-tier Applications to Infrastructure-as-a-Service Clouds: An Investigation Using Kernel-Based Virtual Machines","authors":"W. Lloyd, S. Pallickara, O. David, J. Lyon, M. Arabi, K. Rojas","doi":"10.1109/Grid.2011.26","DOIUrl":"https://doi.org/10.1109/Grid.2011.26","url":null,"abstract":"To investigate challenges of multi-tier application migration to Infrastructure-as-a-Service (IaaS) clouds we performed an experimental investigation by deploying a processor bound and input-output bound variant of the RUSLE2 erosion model to an IaaS based private cloud. Scaling the applications to achieve optimal system throughput is complex and involves much more than simply increasing the number of allotted virtual machines (VMs). While scaling the application variants a series of bottlenecks were encountered unique to an application's processing, I/O, and memory requirements, herein referred to as an application's profile. To investigate the impact of provisioning variation for hosting multi-tier applications we tested four schemes of VM deployments across the physical nodes of our cloud. Performance degradation was more pronounced when multiple I/O or CPU resource intensive application components were co-located on the same physical hardware. We investigated the virtualization overhead incurred using Kernel-based virtual machines (KVM) by deploying our application variants to both physical and virtual machines. Overhead varied based on the unique characteristics of each application's profile. We observed ~112% overhead for the input/output bound application and just ~ 10% overhead for the processor bound application. Understanding an application's profile was found to be important for optimal IaaS-based cloud migration and scaling.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134589682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Benchmarking MapReduce Implementations for Application Usage Scenarios","authors":"Zacharia Fadika, Elif Dede, M. Govindaraju, L. Ramakrishnan","doi":"10.1109/Grid.2011.21","DOIUrl":"https://doi.org/10.1109/Grid.2011.21","url":null,"abstract":"The MapReduce paradigm provides a scalable model for large scale data-intensive computing and associated fault-tolerance. With data production increasing daily due to ever growing application needs, scientific endeavors, and consumption, the MapReduce model and its implementations need to be further evaluated, improved, and strengthened. Several MapReduce frameworks with various degrees of conformance to the key tenets of the model are available today, each, optimized for specific features. HPC application and middleware developers must thus understand the complex dependencies between MapReduce features and their application. We present a standard benchmark suite for quantifying, comparing, and contrasting the performance of MapReduce platforms under a wide range of representative use cases. We report the performance of three different MapReduce implementations on the benchmarks, and draw conclusions about their current performance characteristics. The three platforms we chose for evaluation are the widely used Apache Hadoop implementation, Twister, which has been discussed in the literature, and LEMO-MR, our own implementation. The performance analysis we perform also throws light on the available design decisions for future implementations, and allows Grid researchers to choose the MapReduce implementation that best suits their application's needs.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130384096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting Deadline Constrained Distributed Computations on Grids","authors":"Xinghui Zhao, Nadeem Jamali","doi":"10.1109/Grid.2011.29","DOIUrl":"https://doi.org/10.1109/Grid.2011.29","url":null,"abstract":"The growing popularity of grid and cloud computing has led to a renewed interest in resource control and coordination. The Actor model, which encapsulates objects along with threads of execution, offers a convenient way for scheduling computations' access to resources by way of scheduling of the actor threads. However, efficient Actor implementations do not use a thread for each actor, making implementation of fine-grained resource scheduling decisions difficult. This paper presents our work on integrating mechanisms for deadline assurance into an optimized implementation of Actors. We achieve this by using deadline-driven adaptive scheduling, which prioritizes individual message deliveries and method executions involved in a distributed computation, based on the calculated deadlines by which each must be completed. These deadlines can be efficiently calculated at run-time for an important class of computations which use pipeline interaction style. Additionally, a tuner dynamically balances -- manually or automatically -- the overhead of the control mechanisms against the extent of control exercised. Experimental evaluation shows that the approach offers effective support for timeliness requirements (for multimedia QoS, for example) at the cost of a relatively modest and adjustable overhead.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128869246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond Batch Computing on the WLCG Grid","authors":"M. Meoni","doi":"10.1109/Grid.2011.34","DOIUrl":"https://doi.org/10.1109/Grid.2011.34","url":null,"abstract":"The traditional use of Grid computing consists in submitting batch jobs and waiting for results to be produced, without any prediction of the time at which a job will be effectively started at a selected site. This paper aims at widening the use of Grid technology for enabling a private virtual cluster on the WLCG Grid and running one's favorite software. We term such a system iGrid, interactive Grid, since a user can then \"interact\" with Grid nodes right away. In addition, the Fairy CPU fairs hare mechanism allows users to get their fair iGrid machine share over a long period. We highlight important principles for introducing interactive capability to the Grid. We present a functioning prototype of iGrid running PROOF - a software for data intensive parallel computing - on top of it. Our prototype implementation scales to all 400 nodes we were granted. The experiment shows the performance of PROOF on iGrid sites and challenges many aspect of dynamic resource allocation and remote data processing. We identify bottlenecks that inhibit interactivity and outline opportunities and limitations for running interactive Grid-distributed HEP applications on the Grid.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130369108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing Resource Consumptions in Clouds","authors":"Ligang He, Deqing Zou, Zhang Zhang, Kai Yang, Hai Jin, S. Jarvis","doi":"10.1109/Grid.2011.15","DOIUrl":"https://doi.org/10.1109/Grid.2011.15","url":null,"abstract":"This paper considers the scenario where multiple clusters of Virtual Machines (i.e., termed as Virtual Clusters) are hosted in a Cloud system consisting of a cluster of physical nodes. Multiple Virtual Clusters (VCs) cohabit in the physical cluster, with each VC offering a particular type of service for the incoming requests. In this context, VM consolidation, which strives to use a minimal number of nodes to accommodate all VMs in the system, plays an important role in saving resource consumption. Most existing consolidation methods proposed in the literature regard VMs as \"rigid\" during consolidation, i.e., VMs' resource capacities remain unchanged. In VC environments, QoS is usually delivered by a VC as a single entity. Therefore, there is no reason why VMs' resource capacity cannot be adjusted as long as the whole VC is still able to maintain the desired QoS. Treating VMs as being \"mouldable\" during consolidation may be able to further consolidate VMs into an even fewer number of nodes. This paper investigates this issue and develops a Genetic Algorithm (GA) to consolidate mouldable VMs. The GA is able to evolve an optimized system state, which represents the VM-to-node mapping and the resource capacity allocated to each VM. After the new system state is calculated by the GA, the Cloud will transit from the current system state to the new one. The transition time represents overhead and should be minimized. In this paper, a cost model is formalized to capture the transition overhead, and a reconfiguration algorithm is developed to transit the Cloud to the optimized system state at the low transition overhead. Experiments have been conducted in this paper to evaluate the performance of the GA and the reconfiguration algorithm.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131333551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Elastic Application Container","authors":"Sijin He, Li Guo, Yike Guo","doi":"10.1109/GRID.2011.35","DOIUrl":"https://doi.org/10.1109/GRID.2011.35","url":null,"abstract":"The computing resource level architecture allows end-users to directly control its underlying computer resources, such as VM (virtual machine) operations, scaling, networking, etc. However, setting up and maintaining a working environment is complex and time consuming for end-users and resource management is also a heavy-weight task for the providers. In contrast, the application resource level architecture automatically controls its underlying computer resources so that end-users can concentrate on their core business. In this paper, we propose a new architecture called Elastic Application Container (EAC) that enables the end-users to efficiently develop and deliver light-weight, elastic, multi-tenant, and portable applications. The EAC is an abstract representation which hides all its abstractions of the underlying VMs. We believe that our EAC architecture has the potential to become the foundation of future application resource level model in this research area.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129120061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mediation of Service Overhead in Service-Oriented Grid Architectures","authors":"Per-Olov Östberg, E. Elmroth","doi":"10.1109/Grid.2011.11","DOIUrl":"https://doi.org/10.1109/Grid.2011.11","url":null,"abstract":"Grid computing applications and infrastructures build heavily on Service-Oriented Computing development methodology and are often realized as Service-Oriented Architectures. The Grid Job Management Framework (GJMF) is a flexible Grid infrastructure and application support tool that offers a range of abstractive and platform independent interfaces for middleware-agnostic Grid job submission, monitoring, and control. In this paper we use the GJMF as a test bed for characterization of Grid Service-Oriented Architecture overhead, and evaluate the efficiency of a set of design patterns for overhead mediation mechanisms featured in the framework.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129151901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}