{"title":"Rectangular vs. triangular minimal routing and performance study","authors":"D. Désérable, R. Hoffmann","doi":"10.1109/HPCSim.2012.6266917","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266917","url":null,"abstract":"This paper presents a comparative study of the XY-routing protocol in the square grid “S” examined herein with the XYZ protocol in the triangular grid “T” examined elsewhere, both with toroidal connections and N nodes. The routing problem (also called multiple target searching) is performed in a partitioned cellular automata network with agents (or messages) moving from sources to targets, preferably on their minimal routes. The network in S consists of N nodes with 4 buffers per node. Buffers with the same names are connected to their neighboring nodes via unidirectional links. Each buffer may host an agent and each agent situated in a buffer has a computed direction defining the new buffer in the adjacent node. Two scenarios are examined: (i) N-1 agents are moving to a common target (also called “all-to-one gathering”) (ii) N/2 agents are moving to N/2 targets. It is shown that in both cases the T grid is 1.5 times faster than the S grid. The deterministic minimal routing protocols were also randomized, with agents choosing a random direction in order to cope with congestion and deadlocks. It is shown that randomization can slightly shorten the transfer time in case of congestion, but, more important, deadlocks can be resolved.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124022428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High performance smart expression template math libraries","authors":"Klaus Iglberger, G. Hager, Jan Treibig, U. Rüde","doi":"10.1109/HPCSim.2012.6266939","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266939","url":null,"abstract":"Performance is of utmost importance for linear algebra libraries since they usually are the core of numerical and simulation packages and use most of the available compute time and resources. However, especially in large scale simulation frameworks the readability and ease of use of mathematical expressions is essential for a continuous maintenance, modification, and extension of the software framework. Based on these requirements, in the last decade C++ Expression Templates have gained a reputation as a suitable means to combine an elegant, domain-specific, and intuitive user interface with “HPC-grade” performance. Unfortunately, many of the available ET-based frameworks fall short of the expectation to deliver high performance, adding to the general mistrust towards C++ math libraries. In this paper we present performance results for Smart Expression Template libraries, demonstrating that by proper combination of high-level C++ code and low-level compute kernels both requirements, an elegant interface and high performance, can be achieved.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127079306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bid writing: Is project management different? What is appropriate?","authors":"A. Bochenkov, C. Maple, P. Sant","doi":"10.1109/HPCSim.2012.6266949","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266949","url":null,"abstract":"This will provide you with a brief advice on how to write up a high-quality EU grant proposal, specifically in a part of management structure and procedures. It will give a good insight in regard to key points to be addressed. A consortium management structure is to be tailored to a specific project. It also depends on a project scale, its duration and type. Therefore, it is crucial to deploy an appropriate management structure, not only for successful project proposal but also for a successful project itself.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114466671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated validation of trust and security of service-oriented architectures with the AVANTSSAR platform","authors":"L. Viganò","doi":"10.1109/HPCSim.2012.6266956","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266956","url":null,"abstract":"Cutting-edge network infrastructures such as Service-Oriented Architectures (SOAs) or, more generally, the Internet of Services (IoS) entail a major paradigm shift in the way ICT systems and applications are designed, implemented, deployed and consumed: they are no longer the result of programming components in the traditional meaning but are built by composing services that are distributed over the network and reconfigured and consumed dynamically in a demand-driven, flexible way. However, the new opportunities opened by the IoS will only materialize if concepts, techniques and tools are provided to ensure security. In fact, deploying services in such network infrastructures entails a wide range of trust and security issues, but solving them is extremely hard since making the service components trustworthy is not sufficient: composing services leads to new, subtle and dangerous, vulnerabilities due to interference between component services and policies, the shared communication layer, and application functionality. Thus, one needs validation of both the service components and their composition into secure service architectures.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125737027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mapping the PRAM model onto the Intel SCC many-core processor","authors":"Carsten Clauss, Stefan Lankes, T. Bemmerl","doi":"10.1109/HPCSim.2012.6266943","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266943","url":null,"abstract":"The Parallel Random Access Machine (PRAM) model describes an abstract register machine for analyzing the complexity and scalability of parallel algorithms. Unfortunately, it is not possible to implement this model directly in hardware but it is at least possible to emulate this abstract model on more realistic parallel machines. Moreover, the recent evolution of processor architectures towards a forthcoming many-core era seems to indicate that PRAM-derived hardware architectures may even become important in the near future. The Single-chip Cloud Computer (SCC) is a recent example for an experimental many-core processor. By means of this processor, researchers have the opportunity to investigate the requirements of tomorrow's software design and programming models. In this paper, we discuss if and how the PRAM model could be mapped onto the SCC by exploiting its many-core related hardware features.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129355164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large electromagnetic problem on large scale parallel computing systems","authors":"M. Alexandru, T. Monteil, F. Coccetti, H. Aubert, P. Lorenz","doi":"10.1109/HPCSim.2012.6266968","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266968","url":null,"abstract":"This paper deals with the electromagnetic modeling of large and complex electrical structures by means of large scale parallel systems, such as Grid Computing and supercomputer. Transmission-Line Matrix modeling method is applied to homogeneous volumes. The planar structures are modelled with the mode matching approach. The results prove the benefits of the grid computing and supercomputer environments to solve electrically large structures. A prediction model for computing performances on grid, based on a hybrid approach that combines a historic-based prediction and an application profile-based prediction, has been developed. The predicted values are in good agreement with the measured values.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117303516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workflows orchestration in distributed computing infrastructures","authors":"M. Plóciennik, T. Zok, A. Gómez-Iglesias, Francisco Castejón-Magaña, Andrés Bustos, M. Pascual, Jose Luis Velasco","doi":"10.1109/HPCSim.2012.6266982","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266982","url":null,"abstract":"This paper presents an overview of the new developments carried out to offer a reliable and efficient support for different computing infrastructures in the Kepler workflow orchestration system. The aim of the work is to help scientists to transparently use these infrastructures regardless of the underlying middleware. We introduce new complex workflow scenarios developed in the context of the EU EGI Inspire project and their exploitation by two nuclear fusion applications. The presented use cases represent typical scenarios which are common also in other fields and could be easily reused for other applications.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130040214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel computations in Java with PCJ library","authors":"Marek Nowicki, P. Bała","doi":"10.1109/HPCSIM.2012.6266941","DOIUrl":"https://doi.org/10.1109/HPCSIM.2012.6266941","url":null,"abstract":"In this paper we present PCJ - a new library for parallel computations in Java inspired by the partitioned global address space approach. We present design details together with the examples of usage for basic operations such as a point-point communication, synchronization or broadcast. The PCJ library is easy to use and allows for a fast development of the parallel programs. It allows to develop distributed applications in easy way, hiding communication details which allows user to focus on the implementation of the algorithm rather than on network or threads programming. The objects and variables are local to the program threads however some variables can be marked shared which allows to access them from different nodes. For shared variables PCJ offers one-sided communication which allows for easy access to the data stored at the different nodes. The parallel programs developed with PCJ can be run on the distributed systems with different Java VM running on the nodes. In the paper we present evaluation of the performance of the PCJ communication on the state of art hardware. The results are compared with the native MPI implementation showing good performance and scalability of the PCJ.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130308310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithmic strategies for optimizing the parallel reduction primitive in CUDA","authors":"Pedro J. Martín, Luis F. Ayuso, Roberto Torres, Antonio Gavilanes","doi":"10.1109/HPCSim.2012.6266966","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266966","url":null,"abstract":"Many general-purpose applications exploit Graphics Processing Units (GPUs) by executing a set of well-known data-parallel primitives. Those primitives are usually invoked from the host many times, so their throughput has a great impact on the performance of the overall system. Thus, the study of novel algorithmic strategies to optimize their implementation on current devices is an interesting topic to the GPU community. In this paper we focus on optimizing the reduction primitive, which merely reduces a data sequence into a single value using a binary associative operator. Although tree-based and sequential-based algorithms have been already implemented on GPUs, a comparison of both algorithm performance had not been carried out yet. Thus, our first contribution is to present an experimental study of state-of-the-art reduction algorithms on CUDA. Next we introduce two algorithmic optimizations that are integrated into the fastest solution (a sequential-based algorithm), improving its throughput even more. Finally, we replicate this methodology to the segmented version of the primitive, which applies when the input is composed of several independent segments. In this case, it is not clear which algorithm exhibits the best performance, since throughput deeply depends on the distribution of segments along the input. According to our results, tree-based algorithms run faster for small segments, while sequential methods are better for medium and large ones.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132547569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evolutionary power-aware routing in VANETs using Monte-Carlo simulation","authors":"J. Toutouh, Sergio Nesmachnow, E. Alba","doi":"10.1109/HPCSim.2012.6266900","DOIUrl":"https://doi.org/10.1109/HPCSim.2012.6266900","url":null,"abstract":"This work addresses the reduction of power consumption of the AODV routing protocol in vehicular networks as an optimization problem. Nowadays, network designers focus on energy-aware communication protocols, specially to deploy wireless networks. Here, we introduce an automatic method to search for energy-efficient AODV configurations by using an evolutionary algorithm and parallel Monte-Carlo simulations to improve the accuracy of the evaluation of tentative solutions. The experimental results demonstrate that significant power consumption improvements over the standard configuration can be attained, with no noteworthy loss in the quality of service.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"311 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122231378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}