2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum: Latest Papers (Page 8)

Parallel Circuit Simulation on Multi/Many-core Systems

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.319

Xiaoming Chen, Yu Wang, Huazhong Yang

Citations: 2

Parallelizing the Computation of Green Functions for Computational Electromagnetism Problems

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.174

C. Pérez-Alcaraz, D. Giménez, Alejandro Álvarez Melcón, F. Quesada-Pereira

Citations: 3

Modeling Power and Energy Usage of HPC Kernels

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.121

Ananta Tiwari, M. Laurenzano, L. Carrington, A. Snavely

Abstract: Compute-intensive kernels make up the majority of execution time in HPC applications. Therefore, many of the power draw and energy consumption traits of HPC applications can be characterized in terms of the power draw and energy consumption of these constituent kernels. Given that power and energy-related constraints have emerged as major design impediments for exascale systems, it is crucial to develop a greater understanding of how kernels behave in terms of power/energy when subjected to different compiler-based optimizations and different hardware settings. In this work, we develop CPU and DIMM power and energy models for three extensively utilized HPC kernels by training artificial neural networks. These networks are trained using empirical data gathered on the target architecture. The models utilize kernel-specific compiler-based optimization parameters and hardware tunables as inputs and make predictions for the power draw rate and energy consumption of system components. The resulting power draw and energy usage predictions have an absolute error rate that averages less than 5.5% for three important kernels: matrix multiplication (MM), stencil computation and LU factorization.

Citations: 63

Efficient Reconfiguration Algorithm for Three-dimensional VLSI Arrays

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.29

Guiyuan Jiang, W. Jigang, Ji-zhou Sun

Citations: 6

Managing Dynamic Reconfiguration for Fault-tolerance on a Manycore Architecture

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.38

Z. Ul-Abdin, Essayas Gebrewahid, B. Svensson

Abstract: With the advent of manycore architectures comprising hundreds of processing elements, fault management has become a major challenge. We present an approach that uses the occam-pi language to manage the fault recovery mechanism on a new manycore architecture, the Platform 2012 (P2012). The approach is made possible by extending our previously developed compiler framework to compile occam-pi implementations to the P2012 architecture. We describe the techniques used to translate the salient features of the occam-pi language to the native programming model of the P2012 architecture. We demonstrate the applicability of the approach by an experimental case study, in which the DCT algorithm is implemented on a set of four processing elements. During run-time, some of the tasks are then relocated from assumed faulty processing elements to the faultless ones by means of dynamic reconfiguration of the hardware. The operation of the demonstrator and the simulation results illustrate not only the feasibility of the approach but also how the use of higher-level abstractions simplifies the fault handling.

Citations: 3

A Portable High-Productivity Approach to Program Heterogeneous Systems

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.15

Z. Bozkus, B. Fraguela

Abstract: The exploitation of heterogeneous resources is becoming increasingly important for general purpose computing. Unfortunately, heterogeneous systems require much more effort to be programmed than the traditional single or even multi-core computers most programmers are familiar with. Not only new concepts, but also new tools with different restrictions must be learned and applied. Additionally, many of these approaches are specific to one vendor or device, resulting in little portability or rapid obsolescence for the applications built on them. Open standards for programming heterogeneous systems such as OpenCL help improve the situation, but the requirement of portability has led to a programming interface more complex than that of other approaches. In this paper we present a novel library-based approach to programming heterogeneous systems that couples portability with ease of use. Our evaluations indicate that while the performance of our library, called Heterogeneous Programming Library (HPL), is on par with that of OpenCL, the current standard for portable heterogeneous computing, the programming effort required by HPL is 3 to 10 times smaller than that of OpenCL based on the authors' implementation of five benchmarks.

Citations: 5

Courses in High-performance Computing for Scientists and Engineers

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.169

R. Vuduc, Kenneth Czechowski, Aparna Chandramowlishwaran, JeeWhan Choi

Citations: 1

Implementation of XcalableMP Device Acceleration Extention with OpenCL

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.296

Takuma Nomizu, D. Takahashi, Jinpil Lee, T. Boku, M. Sato

Abstract: Due to their outstanding computational performance, many acceleration devices, such as GPUs, the Cell Broadband Engine (Cell/B.E.), and multi-core processors, are attracting a lot of attention in the field of high-performance computing. Although there are many programming models and languages designed for programming accelerators, such as CUDA, AMD Accelerated Parallel Processing (AMD APP), and OpenCL, these models remain difficult and complex. Furthermore, when programming for accelerator-enhanced clusters, we have to use an inter-node programming interface, such as MPI, to coordinate the nodes. In order to address these problems and reduce complexity, an extension to XcalableMP (XMP), a PGAS language, for use on accelerator-enhanced clusters, called XcalableMP Device Acceleration Extension (XMP-dev), is proposed. In XMP-dev, global distributed data is mapped onto the distributed memory of each accelerator, and a fragment of code can be offloaded to execute on a set of accelerators. It eliminates the complex programming between nodes and accelerators and between nodes. In this paper, we present an implementation of the XMP-dev runtime library with the OpenCL APIs, whereas the previous implementation targeted CUDA only. Since OpenCL is a standardized interface supported on various kinds of accelerators, it improves the portability of XMP-dev and reduces the cost of development. In the performance evaluation, we show that the OpenCL implementation of XMP-dev can generate portable programs that run not only on NVIDIA GPU-enhanced clusters but also on various other accelerator-enhanced clusters.

Citations: 8

Deploying Scalable and Secure Secret Sharing with GPU Many-Core Architecture

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.173

Su Chen, Ling Bai, Yi Chen, Hai Jiang, Kuan-Ching Li

Abstract: Secret sharing is an excellent alternative to traditional cryptographic algorithms due to its unkeyed encryption/decryption and fault tolerance features. The key management hassle faced in most encryption strategies is removed from users, and the loss of a certain number of data copies can be tolerated. However, secret sharing schemes have to deal with two contradictory design goals: security and performance. Without the involvement of keys, a large security margin is expected for the illusion of being computationally secure. In the meantime, such a design will degrade the performance of "encrypting" and "decrypting" secrets. Thus, secret sharing is mainly used for small data such as keys and passwords. In order to apply secret sharing to large data sets, this paper redesigns the original schemes to balance security and performance. With a sufficient security margin, a Graphics Processing Unit (GPU) is adopted to provide satisfactory performance. The proposed secret sharing scheme with GPU acceleration is a practical choice for large-volume data security. It is particularly good for long-term storage because of its unkeyed encryption and fault tolerance. Performance analysis and experimental results have demonstrated the effectiveness and efficiency of the proposed scheme.

Citations: 3

Engineering a New Curriculum: Experiences at Ohio University in Incorporating the IEEE-TCPP Curriculum Initiative During a Transition to Semesters

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum Pub Date : 2012-05-21 DOI: 10.1109/IPDPSW.2012.167

D. Juedes, Frank Drews

Abstract: This paper describes the efforts at Ohio University to incorporate selected topics from the IEEE-TCPP Curriculum Initiative into the Computer Science/Computer Engineering curriculum prior to a transition to semesters at Ohio University that will occur in the Fall of 2012. In particular, this paper describes our efforts to incorporate (and evaluate) selected elements of the IEEE-TCPP Curriculum Initiative into three courses in order to best determine the appropriate placement of topics related to parallel and distributed computing in the new CS/CpE curriculum under the semester calendar. In particular, we plan to add or revise existing modules and assignments for CS2 (CS 240B, CS 240C at Ohio University, CS 2401 under semesters), DS/A (CS 361 Data Structures at Ohio University, CS 3610 under semesters), and Systems (CS 442 Operating Systems and Computer Architecture I, CS 4420 under semesters) to help us determine which curricular recommendations belong in those three courses in the new semesters curriculum and which topics are more appropriately placed in new required courses entitled EE 3613 Computer Organization and CS 4000 Introduction to Parallel, Distributed, and Web-Centric Computing or in other existing advanced courses such as CS 4040 Design and Analysis of Algorithms or CS 4100 Formal Languages and Compilers.

Citations: 3