{"title":"Using the Gfarm File System as a POSIX Compatible Storage Platform for Hadoop MapReduce Applications","authors":"S. Mikami, Kazuki Ohta, O. Tatebe","doi":"10.1109/Grid.2011.31","DOIUrl":"https://doi.org/10.1109/Grid.2011.31","url":null,"abstract":"MapReduce is a promising parallel programming model for processing large data sets. Hadoop is an up-and-coming open-source implementation of MapReduce. It uses the Hadoop Distributed File System (HDFS) to store input and output data. Due to a lack of POSIX compatibility, it is difficult for existing software to directly access data stored in HDFS. Therefore, it is not possible to share storage between existing software and MapReduce applications. In order for external applications to process data using MapReduce, we must first import the data, process it, then export the output data into a POSIX compatible file system. This results in a large number of redundant file operations. In order to solve this problem we propose using Gfarm file system instead of HDFS. Gfarm is a POSIX compatible distributed file system and has similar architecture to HDFS. We design and implement of Hadoop-Gfarm plug-in which enables Hadoop MapReduce to access files on Gfarm efficiently. We compared the MapReduce workload performance of HDFS, Gfarm, PVFS and Gluster FS, which are open-source distributed file systems. Our various evaluations show that Gfarm performed just as well as Hadoop's native HDFS. In most evaluations, Gfarm performed bettar than twice as well as PVFS and Gluster FS.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"1108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116058366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A WS-Agreement-Based QoS Auditor Negotiation Mechanism for Grids","authors":"Alisson Andrade, A. Melo","doi":"10.1109/Grid.2011.10","DOIUrl":"https://doi.org/10.1109/Grid.2011.10","url":null,"abstract":"High performance platforms composed of commodity computing resources, such as grids and peer-to-peer systems, have greatly evolved and assumed an important role in the last decade. Nevertheless, their wide commercial use still depends on the establishment of an effective quality of service (QoS) infrastructure in those environments. For this reason, a variety of proposals have recently emerged in which consumer and provider monitor and control grid resources in order to guarantee previously established service level agreements. However, in many cases there is lack of trust between provider and consumer in relation to monitoring those agreements. In such cases, it becomes necessary to introduce a third entity - an impartial and trustworthy QoS auditor - in order to solve conflicts of interest. Though, as there may be several auditors trusted by provider and consumer, we claim that the QoS auditor needs to be negotiated and established just as the service level agreement is negotiated by the parties. In order to support this issue, the present paper proposes and evaluates a negotiation mechanism for QoS auditors in computational grids. Some of the proposed mechanism's characteristics are low intrusiveness and use of open standards, such as the WS-Agreement. Experimental analysis on a prototype of the proposed negotiation mechanism have shown that the auditor negotiation process took less than a minute to finish, which is far less than the service execution time in most grid computing use cases.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124679498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replicated Grid Resources","authors":"Sal Valente, A. Grimshaw","doi":"10.1109/Grid.2011.33","DOIUrl":"https://doi.org/10.1109/Grid.2011.33","url":null,"abstract":"We have added support for replication of stateful resources in a Web services based grid platform. Replication allows resources to be highly available for both reading and writing. The contributions of this work are algorithms for update propagation, conflict detection, and conflict resolution for generic resources in a decentralized environment. In order to show that these generic algorithms can be applied to specific resource types, we present a Web services based distributed file system with automatic replication and automatic fail over. We show that this system can read and write files and directories, with no loss of data, during a server failure.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114657887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Grid Security Posture through Multi-factor Authentication","authors":"Victor Hazlewood, P. Kovatch, M. Ezell, Matthew Johnson, P. Redd","doi":"10.1109/Grid.2011.41","DOIUrl":"https://doi.org/10.1109/Grid.2011.41","url":null,"abstract":"While methods of securing communication over the Internet have changed from clear text to secure encrypted channels over the last decade, the basic username-password combination for authentication has remained the mainstay in academic research computing and grid environments. Security incidents affecting grids, such as the TeraGrid stakkato incident of 2004 and 2005, has demonstrated that the use of reusable passwords for authentication can be readily exploited and can lead to a widespread security incident across the grid [1,2]. The University of Tennessee's National Institute for Computational Sciences (NICS) founded in 2008 has provided resources to the TeraGrid, including Kraken, a 1.17 petaflops Cray XT5, and has implemented and promoted the use of multi-factor authentication mechanisms since its founding. The benefits of use of this stronger authentication method has been higher productivity and resource availability for users due to no known user account compromises caused by stolen NICS user credentials that led to disabling accounts or system resources. NICS has been developing and experimenting with expanding our use of multi-factor authentication to the grid. NICS has integrated multi-factor authentication with our certificate authority so that users can now run my proxy and receive a multi-factor authenticated certificate. NICS is also exploring the federation of multi-factor authentication systems, with the goal of \"one user, one token\". This is especially important, as new grid resources, such as Blue Waters, will only allow multi-factor authentication, and we want the users to only carry one token, not many tokens. XSEDE, the TeraGrid successor, will also be deploying multi-factor authentication in addition to the other existing authentication methodologies. XSEDE will also work closely with science gateways and workflows to develop and maintain secure frameworks for the highest level of security possible.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122248049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differentiated Availability in Cloud Computing SLAs","authors":"A. Undheim, Ameen Chilwan, P. Heegaard","doi":"10.1109/Grid.2011.25","DOIUrl":"https://doi.org/10.1109/Grid.2011.25","url":null,"abstract":"Cloud computing is the new trend in service delivery, and promises large cost savings and agility for the customers. However, some challenges still remain to be solved before widespread use can be seen. This is especially relevant for enterprises, which currently lack the necessary assurance for moving their critical data and applications to the cloud. The cloud SLAs are simply not good enough. This paper focuses on the availability attribute of a cloud SLA, and develops a complete model for cloud data centers, including the network. Different techniques for increasing the availability in a virtualized system are investigated, quantifying the resulting availability. The results show that depending on the failure rates, different deployment scenarios and fault-tolerance techniques can be used for achieving availability differentiation. However, large differences can be seen from using different priority levels for restarting of virtual machines.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127824394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-Cut Based Coscheduling Strategy Towards Efficient Execution of Scientific Workflows in Collaborative Cloud Environments","authors":"Kefeng Deng, Junqiang Song, Kaijun Ren, Dong Yuan, Jinjun Chen","doi":"10.1109/Grid.2011.14","DOIUrl":"https://doi.org/10.1109/Grid.2011.14","url":null,"abstract":"Recently, cloud computing has emerged as a promising computing infrastructure for performing scientific workflows by providing on-demand resources. Meanwhile, it is convenient for scientific collaboration since different cloud environments used by the researchers are connected through Internet. However, the significant latency arising from frequent access to large datasets and the corresponding data movements across geo-distributed data centers has been an obstacle to hinder the efficient execution of data-intensive scientific workflows. In this paper, we propose a novel graph-cut based data and task co scheduling strategy for minimizing the data transfer across geo-distributed data centers. Specifically, a dependency graph is firstly constructed from workflow provenance and cut into sub graphs according to the datasets which must appear in fixed data centers by a multiway cut algorithm. Then, the sub graphs might be recursively cut into smaller ones by a minimum cut algorithm referring to data correlation rules until all of them can well fit the capacity constraints of the data centers where the fixed location datasets reside. In this way, the datasets and tasks are distributed into target data centers while the total amount of data transfer between them is minimized. Additionally, a runtime scheduling algorithm is exploited to dynamically adjust the data placement during execution to prevent the data centers from overloading. Simulation results demonstrate that the total volume of data transfer across different data centers can be significantly reduced and the cost of performing scientific workflows on the clouds will be accordingly saved.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133586671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Autonomic Resource Management with Support Vector Machines","authors":"Oliver Niehörster, Alexander Krieger, J. Simon, A. Brinkmann","doi":"10.1109/Grid.2011.28","DOIUrl":"https://doi.org/10.1109/Grid.2011.28","url":null,"abstract":"The use of virtualization technology makes data centers more dynamic and easier to administrate. Today, cloud providers offer customers access to complex applications running on virtualized hardware. Nevertheless, big virtualized data centers become stochastic environments and the implification on the user side leads to many challenges for the provider. He has to find cost-efficient configurations and has to deal with dynamic environments to ensure service guarantees. In this paper, we introduce a software solution that reduces the degree of human intervention to manage cloud services. We present a multi-agent system located in the Software as a Service (SaaS) layer. Agents allocate resources, configure applications, check the feasibility of requests, and generate cost estimates. The agents learn behavior models of the services via Support Vector Machines (SVMs) and share their experiences via a global knowledge base. We evaluate our approach on real cloud systems with three different applications, a brokerage system, a high-performance computing software, and a web server.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133602628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mutual Job Submission Architecture That Considered Workload Balance Among Computing Resources in the Grid Interoperation","authors":"K. Saga, K. Aida, K. Miura","doi":"10.1109/Grid.2011.12","DOIUrl":"https://doi.org/10.1109/Grid.2011.12","url":null,"abstract":"Computing resource federation among collaborators is necessary for smooth promotion of collaborations. However, this is difficult for the collaborators who are using different type grid infrastructures, because of incompatibilities of the grid middleware. Therefore an inter grid job submission specification named HPC Basic Profile (HPCBP) has been defined by the Open Grid Forum (OGF) and many grid projects have implemented it. However, there still are many problems in the grid interoperation using the HPCBP. One of them is the workload disruption problem. The interoperation architecture, which is popular in the implementation of many prototypes, has a race condition between detection of the job submission from another grid and resource allocation for a submitted job from local client. This race condition disrupts the workload balance among the computing resources, and increases number of waiting jobs. In this paper, we explain and analyze the workload problem by an experiment and a simulation, and propose an architecture which can solve the problem, and show the effectiveness of the architecture by a simulation.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128642866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Scheduling on Power-Aware Managed Data-Centers Using Machine Learning","authors":"J. L. Berral, Ricard Gavaldà, J. Torres","doi":"10.1109/GRID.2011.18","DOIUrl":"https://doi.org/10.1109/GRID.2011.18","url":null,"abstract":"Energy-related costs have become one of the major economic factors in IT data-centers, and companies and the research community are currently working on new efficient power-aware resource management strategies, also known as \"Green IT\". Here we propose a framework for autonomic scheduling of tasks and web-services on cloud environments, optimizing the profit taking into account revenue for task execution minus penalties for service-level agreement violations, minus power consumption cost. The principal contribution is the combination of consolidation and virtualization technologies, mathematical optimization methods, and machine learning techniques. The data-center infrastructure, tasks to execute, and desired profit are casted as a mathematical programming model, which can then be solved in different ways to find good task scheduling. We use an exact solver based on mixed linear programming as a proof of concept but, since it is an NP-complete problem, we show that approximate solvers provide valid alternatives for finding approximately optimal schedules. The machine learning is used to estimate the initially unknown parameters of the mathematical model. In particular, we need to predict a priori resource usage (such as CPU consumption) by different tasks under current workloads, and estimate task service-level-agreement (such as response time) given workload features, host characteristics, and contention among tasks in the same host. Experiments show that machine learning algorithms can predict system behavior with acceptable accuracy, and that their combination with the exact or approximate schedulers manages to allocate tasks to hosts striking a balance between revenue for executed tasks, quality of service, and power consumption.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122451807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedded Processor Virtualization for Broadband Grid Computing","authors":"Richard Neill, L. Carloni, Alexander Shabarshin, Valeriy Sigaev, Serguei Tcherepanov","doi":"10.1109/Grid.2011.27","DOIUrl":"https://doi.org/10.1109/Grid.2011.27","url":null,"abstract":"We implemented and evaluated a heterogeneous system architecture that combines a traditional computer cluster with a broadband network of embedded set-top box (STB) devices to provide a distributed computing platform for parallel applications. Our prototype system for broadband grid computing leverages the recent dramatic progress in computational power of STBs. It includes a complete head-end cable system based on the Tru2way standard, a DOCSIS-2.0 network, and an implementation of the Open MPI library running on the STB embedded operating system across 128 devices. An important contribution of our work is a novel method for the virtualization of a large collection of embedded processors within a managed broadband network. This enables the embedded processors to transparently inter-operate with servers in the computer cluster using the message-passing model. To evaluate the interoperability, performance, and scalability of our system we completed a set of experiments with the standard IMB MPI benchmark suite as well as two real parallel applications. The experimental results confirm that there is an important convergence trend between traditional computing and embedded computing and that a broadband network of embedded processors is a promising new platform for a variety of computationally-intensive and data-intensive grid applications.","PeriodicalId":308086,"journal":{"name":"2011 IEEE/ACM 12th International Conference on Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130371627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}