{"title":"Embarrassingly parallel jobs are not embarrassingly easy to schedule on the grid","authors":"E. Afgan, P. Bangalore","doi":"10.1109/MTAGS.2008.4777910","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777910","url":null,"abstract":"Embarrassingly parallel applications represent an important workload in today's grid environments. Scheduling and execution of this class of applications is considered mostly a trivial and well-understood process on homogeneous clusters. However, while grid environments provide the necessary computational resources, associated resource heterogeneity represents a new challenge for efficient task execution for these types of applications across multiple resources. This paper presents a set of examples illustrating how execution characteristics of individual tasks, and consequently a job, are affected by the choice of task execution resources, task invocation parameters, and task input data attributes. It is the aim of this work to highlight this relationship between an application and an execution resource to promote development of better metascheduling techniques for the grid. By exploiting this relationship, application throughput can be maximized, also resulting in higher resource utilization. In order to achieve such benefits, a set of job scheduling and execution concerns is derived leading toward a computational pipeline for scheduling embarrassingly parallel applications in grid environments.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127896063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and evaluation of a collective IO model for loosely coupled petascale programming","authors":"Zhao Zhang, Allan Espinosa, K. Iskra, I. Raicu, Ian T Foster, M. Wilde","doi":"10.1109/MTAGS.2008.4777908","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777908","url":null,"abstract":"Loosely coupled programming is a powerful paradigm for rapidly creating higher-level applications from scientific programs on petascale systems, typically using scripting languages. This paradigm is a form of many-task computing (MTC) which focuses on the passing of data between programs as ordinary files rather than messages. While it has the significant benefits of decoupling producer and consumer and allowing existing application programs to be executed in parallel with no recoding, its typical implementation using shared file systems places a high performance burden on the overall system and on the user who will analyze and consume the downstream data. Previous efforts have achieved great speedups with loosely coupled programs, but have done so with careful manual tuning of all shared file system access. In this work, we evaluate a prototype collective IO model for file-based MTC. The model enables efficient and easy distribution of input data files to computing nodes and gathering of output results from them. It eliminates the need for such manual tuning and makes the programming of large-scale clusters using a loosely coupled model easier. Our approach, inspired by in-memory approaches to collective operations for parallel programming, builds on fast local file systems to provide high-speed local file caches for parallel scripts, uses a broadcast approach to handle distribution of common input data, and uses efficient scatter/gather and caching techniques for input and output. We describe the design of the prototype model, its implementation on the Blue Gene/P supercomputer, and present preliminary measurements of its performance on synthetic benchmarks and on a large-scale molecular dynamics application.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126692234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System support for many task computing","authors":"E. V. Hensbergen, R. Minnich","doi":"10.1109/MTAGS.2008.4777907","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777907","url":null,"abstract":"The popularity of large scale systems such as Blue Gene has extended their reach beyond HPC into the realm of commercial computing. There is a desire in both communities to broaden the scope of these machines from tightly-coupled scientific applications running on MPI frameworks to more general-purpose workloads. Our approach deals with issues of scale by leveraging the huge number of nodes to distribute operating systems services and components across the machine, tightly coupling the operating system and the interconnects to take maximum advantage of the unique capabilities of the HPC system. We plan on provisioning nodes to provide workload execution, aggregation, and system services, and dynamically re-provisioning nodes as necessary to accommodate changes, failure, and redundancy. By incorporating aggregation as a first-class system construct, we will provide dynamic hierarchical organization and management of all system resources. In this paper, we will go into the design principles of our approach using file systems, workload distribution and system monitoring as illustrative examples. Our end goal is to provide a cohesive distributed system which can broaden the class of applications for large scale systems and also make them more approachable for a larger class of developers and end users.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131793006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight execution framework for massive independent tasks","authors":"Hui Li, Huashan Yu, Xiaoming Li","doi":"10.1109/MTAGS.2008.4777911","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777911","url":null,"abstract":"This paper presents a lightweight framework for executing many independent tasks efficiently on grids of heterogeneous computational nodes. It dynamically groups tasks of different granularities and dispatches the groups onto distributed computational resources concurrently. Three strategies have been devised to improve the efficiency of computation and resource utilization. One strategy is to pack up to thousands of tasks into one request. Another is to share the effort in resource discovery and allocation among requests by separating resource allocations from request submissions. The third strategy is to pack variable numbers of tasks into different requests, where the task number is a function of the destination resource's computability. This framework has been implemented in Gracie, a computational grid software platform developed by Peking University, and used for executing bioinformatics tasks. We describe its architecture, evaluate its strategies, and compare its performance with GRAM. Analyzing the experiment results, we found that Gracie outperforms GRAM significantly for execution of sets of small tasks, which is aligned with the intuitive advantage of our approaches built in Gracie.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117210279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ViGs: A grid simulation and monitoring tool for ATLAS workflows","authors":"A. T. Thor, G. Záruba, David Levine, K. De, T. Wenaus","doi":"10.1109/MTAGS.2008.4777909","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777909","url":null,"abstract":"With the recent success in transmitting the first beam through Large Hadron Collider (LHC), generation of vast amount of data from experiments would soon follow in the near future. The data generated that will need to be processed will be enormous, averaging 15 petabytes per year which will be analyzed and processed by one- to two-hundred-thousand jobs per day. These jobs must be scheduled, processed and managed on computers distributed over many countries worldwide. The ability to construct computer clusters on such a virtually unbounded scale will result in increased throughput, removing the barrier of a single computing architecture and operating system, while adding the ability to process jobs across different administrative boundaries, and encouraging collaborations. To date, setting up large scale grids has been mostly accomplished by setting up experimental medium-sized clusters and using trial-and-error methods to test them. However, this is not only an arduous task but is also economically inefficient. Moreover, as the performance of a grid computing architecture is closely tied with its networking infrastructure across the entire virtual organization, such trial-and-error approaches will not provide representative data. A simulation environment, on the other hand, may be ideal for this evaluation purpose as virtually all factors within a simulated VO (virtual organization) can easily be modified for evaluation. Thus we introduceldquovirtual grid simulatorrdquo (ViGs), developed as a large scale grid environment simulator, with the goal of studying the performance, behavioral, and scalability aspects of a working grid environment, while catering to the needs for an underlying networking infrastructure.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134271627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Many-task computing for grids and supercomputers","authors":"I. Raicu, Ian T Foster, Yong Zhao","doi":"10.1109/MTAGS.2008.4777912","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777912","url":null,"abstract":"Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing. Many task computing differs from high throughput computing in the emphasis of using large number of computing resources over short periods of time to accomplish many computational tasks (i.e. including both dependent and independent tasks), where primary metrics are measured in seconds (e.g. FLOPS, tasks/sec, MB/s I/O rates), as opposed to operations (e.g. jobs) per month. Many task computing denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Many task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using standard message passing interface commonly found in high performance computing, drawing attention to the many computations that are heterogeneous but not ldquohappilyrdquo parallel.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131220661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring data parallelism and locality in wide area networks","authors":"Yunhong Gu, R. Grossman","doi":"10.1109/MTAGS.2008.4777906","DOIUrl":"https://doi.org/10.1109/MTAGS.2008.4777906","url":null,"abstract":"Cloud computing has demonstrated that processing very large datasets over commodity clusters can be done simply given the right programming structure. Work to date, for example MapReduce and Hadoop, has focused on systems within a data center. In this paper, we present Sphere, a cloud computing system that targets distributed data-intensive applications over wide area networks. Sphere uses a data-parallel computing model that views the processing of distributed datasets as applying a group of operators to each element in the datasets. As a cloud computing system, application developers can use the Sphere API to write very simple code to process distributed datasets in parallel, while the details, including but not limited to, data locations, server heterogeneity, load balancing, and fault tolerance, are transparent to developers. Unlike MapReduce or Hadoop, Sphere supports distributed data processing on a global scale by exploiting data parallelism and locality in systems over wide area networks.","PeriodicalId":278412,"journal":{"name":"2008 Workshop on Many-Task Computing on Grids and Supercomputers","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129454953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}