2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems: Latest Publications

Acceleration for MPI Derived Datatypes Using an Enhancer of Memory and Network

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1007/978-3-540-87475-1_46

N. Tanabe, H. Nakajo

Citations: 3

Introspection-Based Fault Tolerance for COTS-Based High-Capability Computation in Space

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.11

M. James, A. Shapiro, P. Springer, H. Zima

Abstract: Future missions of deep space exploration face the challenge of designing, building, and operating progressively more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations for such missions, the need for increased autonomy becomes mandatory, along with the requirement for enhanced on-board computational capabilities while in deep space or time-critical situations. This will result in dramatic changes in the way missions will be conducted and supported by on-board computing systems. Specifically, the traditional approach of relying exclusively on radiation-hardened hardware and modular redundancy will not be able to deliver the required computational power. As a consequence, such systems are expected to include high-capability low-power components based on emerging Commercial-Off-The-Shelf (COTS) multi-core technology. This paper describes the design of a generic framework for introspection that supports runtime monitoring and analysis of program execution as well as a feedback-oriented recovery from faults. One of the first applications of this framework will be to provide flexible software fault tolerance matched to the requirements and properties of applications by exploiting knowledge that is either contained in an application knowledge base, provided by users, or automatically derived from specifications. A prototype implementation is currently in progress at the Jet Propulsion Laboratory, California Institute of Technology, targeting a cluster of Cell Broadband Engines.

Citations: 2

A PLD Architecture for High Performance Computing

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.12

Naoki Hirakawa, Masanori Yoshihara, K. Tanigawa, T. Hironaka, Masayuki Sato

Abstract: In recent years, Field Programmable Gate Arrays (FPGAs) have been used for High Performance Computing (HPC). Because there is a significant difference between the configuration speed of an FPGA and the execution speed of a Central Processing Unit (CPU), the difference causes performance degradation. To resolve this problem, we proposed MPLD as a new Programmable Logic Device (PLD) architecture with high speed reconfiguration. The merits of the MPLD in HPC are high speed configuration and easy partial configuration. This is achieved by a configuration method which is the same as the write memory access of conventional parallel memory. In this paper, we describe the problems of using FPGAs in HPC, and present the MPLD architecture which solves the problems. Some evaluation results of the prototype MPLD chip, which was implemented using five-metal-layer ROHM 0.18 µm CMOS technology, are also presented. As results, the memory capacity of the prototype MPLD was 49152 bits, the core area was 1767.54 × 1690.96 µm², and the number of metal layers used for wiring was three. The achieved configuration time is about 6.6 µsec for the whole prototype MPLD. The configuration speed of the prototype MPLD is about 11.7 times higher than AS configuration used for Altera FPGAs.

Citations: 3

The Shape of Things to Come: Future Potential of "Heavy Node" Multi-Core HPC Architectures

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.13

P. Kogge

Citations: 0

Low-Power and High-Performance Communication Mechanism for Dependable Embedded Systems

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.8

T. Hanawa, T. Boku, Shin'ichi Miura, Takayuki Okamoto, M. Sato, K. Arimoto

Citations: 9

Unified Programming Environment for Heterogeneous Distributed Parallel Systems

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.16

S. Hirasawa, H. Honda

Citations: 0

Automatic Application of Last-Touch Instructions for Leakage Energy Reduction

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.10

Kiyofumi Tanaka, Junji Yamano

Citations: 0

Effect of Reordering Internal Messages in MPI Broadcast According to the Load Imbalance

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-21 DOI: 10.1109/IWIA.2008.14

T. Soga, T. Nanri, M. Kurokawa, K. Murakami

Abstract: To achieve higher scalability of parallel programs on large scale parallel computers, reducing the time spent for collective communications is one of the most important issues. In this paper, a dynamic optimization method to adjust the implementation of the Broadcast operation, one of the most popular collective communications, is introduced. Though there have been many attempts to speed up this operation, they assume that each rank starts this operation at the same time. However, in real execution, the time can be different because of load imbalance among ranks. This paper first claims that this difference can increase the cost of this operation. Then, as a method to avoid this problem, an optimization method that adjusts the order of point-to-point messages in Broadcast operations is introduced. This method uses the wait time of each rank at the operation to determine the status of load imbalance. From the results of experiments, it is shown that this optimization method can reduce the time for the operation. In addition, it is also shown that the effect of the optimization depends on the size of the data to be broadcast and the amount of load imbalance.
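The idea in the abstract above, that the order of point-to-point messages inside a broadcast matters when ranks arrive at different times, can be illustrated with a toy timing model. The sketch below is not the authors' algorithm. It assumes a flat broadcast in which the root sends to one receiver at a time, a fixed per-message cost, and known arrival times. The paper's method estimates imbalance from measured wait times instead. The function names `broadcast_finish_times` and `reorder_by_arrival` are hypothetical.

```python
def broadcast_finish_times(arrival, order, root_ready=0.0, send_cost=1.0):
    """Simulate a flat broadcast with blocking sends from the root.

    arrival: dict mapping receiver rank -> time it enters the broadcast.
    order:   sequence of receiver ranks in the order the root sends to them.
    Returns a dict mapping rank -> time at which it has received the data.
    """
    t = root_ready  # time at which the root is free to start the next send
    finish = {}
    for rank in order:
        # A send cannot start before both the root and the receiver are ready.
        start = max(t, arrival[rank])
        t = start + send_cost
        finish[rank] = t
    return finish


def reorder_by_arrival(arrival):
    """Send to the ranks that arrive earliest first (assumed heuristic)."""
    return sorted(arrival, key=lambda rank: arrival[rank])


if __name__ == "__main__":
    # Rank 1 is heavily loaded and reaches the broadcast late.
    arrival = {1: 5.0, 2: 0.0, 3: 1.0}
    rank_order = broadcast_finish_times(arrival, [1, 2, 3])
    reordered = broadcast_finish_times(arrival, reorder_by_arrival(arrival))
    print("rank order completes at", max(rank_order.values()))
    print("reordered completes at ", max(reordered.values()))
```

In this model, sending in rank order stalls the root on the late rank, so every other receiver waits behind it. Sending to early arrivals first lets the root use that idle time.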

Citations: 1

Register File Reliability Analysis Through Cycle-Accurate Thermal Emulation

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2008-01-01 DOI: 10.1109/IWIA.2008.7

J. Ayala, Pablo G Del Valle, David Atienza Alonso

Abstract: Continuous transistor scaling due to improvements in CMOS devices and manufacturing technologies is increasing processor power densities and temperatures; thus, creating challenges when trying to maintain manufacturing yield rates and devices which will be reliable throughout their lifetime. New microarchitectures require new reliability-aware design methods that can face these challenges without significantly increasing cost and performance. In this paper we present a complete analysis of reliability for the register file architecture of the Leon 3 processor. The analysis conducted is supported by the use of an accurate HW/SW FPGA-based emulation platform that enables a complete design space exploration of thermal and reliability metrics during the execution of an extended set of benchmarks, in a very limited amount of time. The effect of various compiler optimizations and register assignments on the reliability of the register file is then analyzed. Our results quantify the respective effects of these different factors and enable us to design a reliability-aware register file assignment policy that consistently improves the Mean-Time-To-Failure figure (20% on average) for the various types of applications.

Citations: 0

Design and Power Performance Evaluation of On-Chip Memory Processor with Arithmetic Accelerators

2008 International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems Pub Date : 2007-08-01 DOI: 10.1109/IWIA.2008.9

C. Takahashi, M. Sato, D. Takahashi, T. Boku, A. Ukawa, Hiroshi Nakamura, Hidetaka Aoki, H. Sawamoto, N. Sukegawa

Abstract: In this paper, we design an on-chip memory processor with arithmetic accelerators, which are expected to improve power consumption. In addition, we evaluate the power performance of the processor. We propose implementing vector-type arithmetic accelerators and SIMD-type arithmetic accelerators in the on-chip memory processor. The evaluation results obtained using our simulator indicate that the performance of the 4FMAs SIMD-type accelerators is similar to that of the 4FMAs vector-type accelerators on DAXPY, Livermore kernel 1 and 3. However, the performance of the 4FMAs vector-type accelerator exceeds that of the 4FMAs SIMD-type accelerator with respect to matrix multiplication and QCD because of difference in element size of the registers. On Livermore kernel 7, the power performance of the 4FMAs SIMD-type accelerators exceeds that of the 4FMAs vector-type because of register reuse. However, the 16FMAs vector-type accelerators have an advantage in almost all simulations, excluding main memory bandwidth intensive benchmarks.

Citations: 4