{"title":"Sorting Algorithms Implemented Using JavaSpaces","authors":"C. Muresan","doi":"10.1109/ISPDC.2012.27","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.27","url":null,"abstract":"Java Spaces technology is a Java implementation of a tuple-based system providing a programming model that views an application as a collection of processes cooperating via the flow of objects into and out of one or more spaces. A space is a shared, network-accessible repository for objects together with their behavior. Processes perform simple operations to write objects into a space, take or read objects from a space that are of interest to them. This technology is suited for parallel computation applications and provides tremendous benefits in terms of scalability and fault-tolerance. I have described how three sorting algorithms can be implemented using Java Spaces technology.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114078464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stabilizing Peer-to-Peer Systems Using Public Cloud: A Case Study of Peer-to-Peer Search","authors":"D. Ram, H. Haridas","doi":"10.1109/ISPDC.2012.26","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.26","url":null,"abstract":"Co-operative peer-to-peer systems have lot of relevance due to their desirable properties like lack of centralized control and transparency. But in certain applications, the stability of peer-to-peer systems is affected when concentrated load spikes occur. In this work, we explore the case for cloud-assisted peer-to-peer systems to handle spikes using peer-to-peer search as a case study. We identify the issues involved in realizing cloud-assisted peer-to-peer systems and propose, implement and evaluate Cloud-Assisted Peer-to-Peer Search (CAPS) architecture which fits in with the co-operative nature of peer-to-peer systems. The experimental results show that CAPS provides stability to peer-to-peer search service during query spikes without affecting user experience adversely.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"280 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114488943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of Data-Parallel Skeletons: A Case Study Using a Coarse-Grained Hierarchical Model","authors":"Chong Li, F. Gava, Gaétan Hains","doi":"10.1109/ISPDC.2012.12","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.12","url":null,"abstract":"Writing parallel programs is known to be notoriously difficult. Often programmers do not want to reason about message-passing algorithms and only want to combine existing high-level patterns to produce their parallel program. This is the algorithmic skeletons approach to parallel programming. It improves reliability and clarity of source code. But skeletons can be insufficient when complicated communication schemes are needed. Expressing skeletons in a more general and low level language in the form of a library seems to be a good compromise between simplicity and expressive power. In this article, we present a coarsed-grained implementation using a hierarchical model of a set of data-parallel skeletons. Programming experiments and benchmarks complete the article.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"152 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122932647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Need of Software Engineering Methods for High Performance Computing Applications","authors":"M. Schmidberger, B. Brügge","doi":"10.1109/ISPDC.2012.14","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.14","url":null,"abstract":"This paper presents the results of an online survey with more than a hundred High Performance Computing (HPC) community members. The goal of the survey was to get a deeper understanding of ongoing HPC projects and the project participants' knowledge about software engineering. Previous and current research give the impression that with software engineering methods adequate for the development of HPC applications time and effort can be saved. But the knowledge of the right and most beneficial methods is not easily available for HPC developers. The results of the survey confirm that there is only little use of software engineering in HPC and, as outlined in the related work section, there is so far no general approach to spread the use of software engineering in the HPC community. But according to our findings, deploying software engineering to facilitate the development of HPC applications is certainly a matter of interest.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121231004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Composable Linear Solvers for Multiphysics","authors":"Jed Brown, M. Knepley, D. May, L. McInnes, Barry F. Smith","doi":"10.1109/ISPDC.2012.16","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.16","url":null,"abstract":"The Portable, Extensible Toolkit for Scientific computing (PETSc), which focuses on the scalable solution of problems based on partial differential equations, now incorporates new components that allow full compos ability of solvers for multiphysics and multilevel methods. Through strong encapsulation, we achieve arbitrary, dynamic composition of hierarchical methods for coupled problems and allow customization of all components in composite solvers. For example, we support block decompositions with nested multigrid as well as multigrid on the fully coupled system with block-decomposed smoothers. This paper provides an overview of PETSc's new multiphysics capabilities, which have been used in parallel applications including lithosphere dynamics, subduction and mantle convection, ice sheet dynamics, subsurface reactive flow, fusion, mesoscale materials modeling, and power networks.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128117533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting State-of-the-Art x86 Architectures in Scientific Computing","authors":"A. Heinecke, T. Auckenthaler, C. Trinitis","doi":"10.1109/ISPDC.2012.15","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.15","url":null,"abstract":"In recent years, general purpose ×86 architectures have undergone significant modifications towards high performance computing capabilities. Lately, technologies like wider vector units or Fused Multiply-Add (FMA) instruction, which were mainly known from GPU arcitectures, have been introduced. In this paper, we examine the performance of current ×86 architectures, namely Intel Sandy Bridge and AMD Bulldozer, for four different parallel workloads with different properties. These properties comprise optimally cache-blocked algorithms as well as adaptive grid structures resulting in memory latency and bandwidth bound executions. The achieved performance on both architectures is very promising, and, if extrapolated towards upcoming server silicon, can be regarded as on par with current high-end GPU based accelerators.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125917523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Memory-Efficient Implementation of a Rigid-Body Molecular Dynamics Simulation","authors":"W. Eckhardt, T. Neckel","doi":"10.1109/ISPDC.2012.22","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.22","url":null,"abstract":"Molecular dynamics simulations are usually optimized with regard to runtime rather than memory consumption. In this paper, we investigate two distinct implementational aspects of the frequently used Linked-Cell algorithm for rigid-body molecular dynamics simulations: the representation of particle data for the force calculation, and the layout of data structures in memory. We propose a low memory footprint implementation, which comes with no costs in terms of runtime. To prove the approach, it was implemented in the programme Mardyn and evaluated on a standard cluster as well as on a Blue Gene/P for representative scenarios.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122042164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tuning a Finite Difference Computation for Parallel Vector Processors","authors":"G. Zumbusch","doi":"10.1109/ISPDC.2012.17","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.17","url":null,"abstract":"Current CPU and GPU architectures heavily use data and instruction parallelism at different levels. Floating point operations are organised in vector instructions of increasing vector length. For reasons of performance it is mandatory to use the vector instructions efficiently. Several ways of tuning a model problem finite difference stencil computation are discussed. The combination of vectorisation and an interleaved data layout, cache aware algorithms, loop unrolling, parallelisation and parameter tuning lead to optimised implementations at a level of 90% peak performance of the floating point pipelines on recent Intel Sandy Bridge and AMD Bulldozer CPU cores, both with AVX vector instructions as well as on Nvidia Fermi/ Kepler GPU architectures. Furthermore, we present numbers for parallel multi-core/ multi-processor and multi-GPU configurations. They represent regularly more than an order of speed up compared to a standard implementation. The analysis may also explain deficiencies of automatic vectorisation for linear data layout and serve as a foundation of efficient implementations of more complex expressions.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133337777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the Robustness of Dynamic Loop Scheduling for Heterogeneous Computing Systems","authors":"Srishti Srivastava, Nitin Sukhija, I. Banicescu, F. Ciorba","doi":"10.1109/ISPDC.2012.29","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.29","url":null,"abstract":"Scheduling scientific applications in parallel on non-dedicated, heterogeneous systems, where the computing resources may differ in availability, is a challenging task, and requires efficient execution and robust scheduling methods. Dynamic loop scheduling methods provide means to achieve the desired robust performance. These methods are based on probabilistic analyses and are inherently robust. However, a methodology is required to measure the robustness of the dynamic loop scheduling methods that ensures their performance in unpredictably changing computing environments. In this paper, a methodology is proposed for performing robustness analysis of the dynamic loop scheduling techniques using a metric, formulated in earlier work, to measure their robustness in heterogeneous computing systems with uncertainties. The dynamic loop scheduling methods have been implemented in a simulation. 
The experimental results were used as an input to the proposed methodology, which in turn has been used to experimentally analyze the robustness of a number of dynamic loop scheduling methods on a heterogeneous system with variable availability.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128387607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Service Friendly Cloud Ecosystem","authors":"Teodor-Florin Fortiş, V. Munteanu, V. Negru","doi":"10.1109/ISPDC.2012.31","DOIUrl":"https://doi.org/10.1109/ISPDC.2012.31","url":null,"abstract":"After the large penetration of Cloud Computing, more and more developers are taking into account migrating their applications to the Cloud, in order to take advantage of the characteristics of this new environment. In close relation with application migration, an increasing number of development and execution platforms, delivered as PaaS solutions (such as mOSAIC, 4CaaSt, Cloud Foundry, Open Shift, Stackato, and others) are offering their services for development, deployment, and execution of applications that are using in an optimum manner the five characteristics of the Cloud. Following this massive migration of applications, especially from SOA, to Cloud environments, new requirements for application development could be identified in order to enable the construction of complex solutions, and to exploit a business level on the top of various *-as-a-Service layers. The introduction of a centralized component, the Cloud Governance, is necessary in order to enable the development of complex cloud ecosystems. 
This centralized component is extending, complementing, completing and integrating core features from the PaaS layer, like monitoring, provisioning, negotiation, and others, and integrate features of various Cloud management solutions.","PeriodicalId":287900,"journal":{"name":"2012 11th International Symposium on Parallel and Distributed Computing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128649749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}