{"title":"Performance Analysis of Grid DAG Scheduling Algorithms using MONARC Simulation Tool","authors":"Florin Pop, C. Dobre, V. Cristea","doi":"10.1109/ISPDC.2008.15","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.15","url":null,"abstract":"This paper presents a solution to analyze the performance of grid scheduling algorithms for tasks with dependencies. Finding the optimal procedures for DAG scheduling in Grid systems is important due to the latest computing necessities: large scale distributed computing and complex applications for different research areas. We propose a solution to evaluate DAG scheduling algorithms using simulation, an approach suitable to evaluate different scheduling algorithms using various task dependencies and considering a wide range of Grid system architectures. Our proposed solution is based on MONARC, a generic simulation framework designed for modeling large scale distributed systems. We present our research results in extending the simulation platform to accommodate various DAG scheduling procedures and, as a case study, we present a critical analysis of four well known DAG scheduling strategies: CCF (Cluster ready Children First), ETF (Earliest Time First), HLFET (Highest Level First with Estimated Times) and Hybrid Remapper. 
The obtained results show that the proposed solution is a very good instrument for evaluating performance in case of a wide range of DAG scheduling algorithms.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130238315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Metropolis-Montecarlo Simulation for Potts Model using an Adaptable Network Topology based on Dynamic Graph Partitioning","authors":"Carlos Castañeda Marroquín, C. Navarrete, A. Ortega, M. Alfonseca, E. Anguiano","doi":"10.1109/ISPDC.2008.51","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.51","url":null,"abstract":"In recent years, computers have increased their computing capacity, and the networks that interconnect them have improved to reach today's high data transfer rates. Programs that try to take advantage of these new technologies cannot be written using traditional programming techniques, since most algorithms were designed to be executed on a single processor, in a non-concurrent form, rather than concurrently on a set of processors working and communicating through a network. This work presents the ongoing development of a new method to simulate the Ferromagnetic Potts model, taking these new technologies into account.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130933118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software probes: towards a quick method for machine characterization and application performance prediction","authors":"A. Strube, Dolores Rexachs, E. Luque","doi":"10.1109/ISPDC.2008.40","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.40","url":null,"abstract":"Computers perform different applications in different ways. To characterize an application's performance on a machine, the usual method is a complete execution of it. This work is a step towards a synthetic probe able to characterize a master-worker application's performance in a fraction of the time required to run it entirely. This is especially important for CPU-intensive scientific applications, which run for a very long time and should therefore run as efficiently (and fast) as possible. Knowing how, and for how long, a master-worker application is going to run can guide the decision to use this machine or not. Our software probe takes into account only the performance-relevant parts of the application, discovering a program's relevant phases. Running solely these significant phases is a powerful way to quickly characterize the application's performance on a machine. It can help to select the best computing nodes in a grid or in a multi-cluster to run this application, and even quickly predict the total execution time for this application/data set on the machine analyzed. 
We also present ongoing work on a fully synthetic probe generated from programs' phases.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128929439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Runtime System Architecture for Ubiquitous Support of OpenMP","authors":"G. C. Philos, V. Dimakopoulos, P. Hadjidoukas","doi":"10.1109/ISPDC.2008.49","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.49","url":null,"abstract":"In this work we present the runtime architecture of the OMPi OpenMP compiler. OMPi is a source-to-source C translator featuring a portable, modular and extensible runtime system. It allows for OpenMP threads to map to different execution entities which range from kernel/user-level threads to processes, providing transparent support of OpenMP applications on both SMP machines and clusters of SMPs. When operating within an SMP machine, arbitrary threading libraries can be employed; currently a multitude of such libraries is available, including one which is based on portable user-level threading, for high-performance nested parallelism support. When operating on a cluster, processes are used as the execution entities and different software DSM cores can be utilized under a unified interface; the runtime system uses a hybrid approach whereby its internal bookkeeping is done through explicit message passing, while user-program shared variables are handled by the DSM core.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114238169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and Formal Validation of High-Performance Embedded Systems","authors":"A. Gamatie, É. Rutten, Huafeng Yu, Pierre Boulet, J. Dekeyser","doi":"10.1109/ISPDC.2008.28","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.28","url":null,"abstract":"This paper presents an approach for the modeling and formal validation of high-performance systems. The approach relies on the repetitive model of computation used to express the parallelism of such systems within the Gaspard framework, which is dedicated to the codesign of high-performance system-on-chip. The system descriptions obtained with this model are then projected on the synchronous model of computation. The result of this projection consists of an equational model that allows one to formally analyze clock synchronizability issues so as to guarantee the reliable deployment of systems on platforms.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121850699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters","authors":"Ravi Reddy, Alexey L. Lastovetsky, P. Alonso","doi":"10.1109/ISPDC.2008.9","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.9","url":null,"abstract":"This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the software hierarchy of the first research implementation of HeteroPBLAS. This is the first step towards the development of a parallel linear algebra package for heterogeneous computational clusters. We demonstrate the efficiency of the HeteroPBLAS programs on a homogeneous computing cluster and a heterogeneous computing cluster.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132942756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spontaneous, Self-Sampling Quorum Systems for Ad Hoc Networks","authors":"K. Konwar, Peter M. Musial, Alexander A. Shvartsman","doi":"10.1109/ISPDC.2008.61","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.61","url":null,"abstract":"Quorum systems (collections of sets with pairwise nonempty intersections) are used in distributed settings to implement services such as consensus and consistent memory. Quorums have been substantially studied in static settings; however, the design and analysis of quorum-based distributed services in resource-limited ad hoc networks is a relatively unexplored area. The pioneering work of Chockler, Gilbert, and Patt-Shamir considers such networks and proposes an implementation of probabilistic quorum systems with per-node communication bit complexity of O(log^2 n), where n is the number of nodes. The authors assume a priori knowledge of the node failure probability p, where 0 ≤ p < 1/4. Additionally, their work overlooks the cost of gathering responses from quorum members by the client. We present a new probabilistic quorum construction with a lower, per quorum access, communication bit complexity of O(log n) for multi-hop networks. Our quorum access algorithm is based on self-sampling by the nodes themselves, in a way equivalent to accessing a quorum set, with high probability. In addition, we provide a novel on-line algorithm to estimate the node failure probability parameter p, thus removing the assumption that it is known a priori. This is accomplished with per-node communication bit complexity of O(log^2 n). We demonstrate the utility of our construction by presenting a single-writer, multi-reader algorithm that uses our probabilistic quorums to implement atomic objects in ad hoc networks, where consistency is guaranteed with high probability. 
We include simulation results illustrating the high probability guarantee for our atomic memory service.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126145177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Exhaustive Comparison Framework for Distributed Shape Differentiation in a MEMS Sensor Actuator Array","authors":"Eugen Dedu, Kahina Boutoustous, J. Bourgeois","doi":"10.1109/ISPDC.2008.55","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.55","url":null,"abstract":"The Smart Surface project aims at designing an integrated micro-manipulator based on an array of micromodules connected in a 2D array network. Each micromodule has a sensor, an actuator and a processing unit. One of the aims of the processing unit is to recognize the shape of the part that is put on top of the smart surface. This recognition, or more precisely this differentiation, is done through a distributed algorithm that we call a criterion. The aim of this article is to present the ECO framework, which is able to exhaustively test different differentiation criteria in terms of differentiation efficiency, memory and processing power needed. The tests show that ECO is of great help for choosing the best criteria to implement inside our smart surface.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129225895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduled Routing in an Optical Hypercube","authors":"Risto T. Honkanen","doi":"10.1109/ISPDC.2008.16","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.16","url":null,"abstract":"In this work we present an all-optical hypercube architecture and a systolic routing protocol for it. An r-dimensional optical hypercube network (OHC) consists of N = 2^r processing nodes and r2^r optical links. We study a systolic routing protocol that is based on cyclic changes of states of routers and scheduled sending of packets. The protocol ensures that no electro-optical conversions are needed in the intermediate routing nodes and all the packets injected into the routing machinery reach their targets without collisions. A work-optimal routing of an h-relation is achieved with a reasonable size of h in omega(N log N).","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116117505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Randomized Online File Allocation on Uniform Ring Networks","authors":"Akira Matsubayashi, Y. Kawamura","doi":"10.1109/ISPDC.2008.27","DOIUrl":"https://doi.org/10.1109/ISPDC.2008.27","url":null,"abstract":"We study the online file allocation problem on ring networks. In this paper, we present a 7-competitive randomized algorithm against an adaptive online adversary on uniform ring networks. The algorithm is deterministic if the file size is 1. Moreover, we obtain lower bounds of 4.25 and 3.833 for a deterministic algorithm and a randomized algorithm against an adaptive online adversary, respectively, on ring networks.","PeriodicalId":125975,"journal":{"name":"2008 International Symposium on Parallel and Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125099638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}