Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops: Latest Articles, Page 2

NVIDIA Grace Superchip Early Evaluation for HPC Applications

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2024-01-11 DOI: 10.1145/3636480.3637284

Fabio Banchelli, Joan Vinyals-Ylla-Catala, Josep Pocurull, Marc Clascà, Kilian Peiro, Filippo Spiga, M. Garcia-Gasulla, Filippo Mantovani

Citations: 0

Using Intel oneAPI for Multi-hybrid Acceleration Programming with GPU and FPGA Coupling

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2024-01-11 DOI: 10.1145/3636480.3637220

Wentao Liang, N. Fujita, Ryohei Kobayashi, T. Boku

Citations: 0

MPI-Adapter2: An Automatic ABI Translation Library Builder for MPI Application Binary Portability

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2024-01-11 DOI: 10.1145/3636480.3637219

Shinji Sumimoto, Toshihiro Hanawa, Kengo Nakajima

Abstract: This paper proposes MPI-Adapter2, an automatic builder of MPI ABI (Application Binary Interface) translation libraries. Container-based job environments are becoming widespread in computer centers. However, when a user runs a container image at another computer center, an MPI binary inside the container may not work because the ABIs of the MPI libraries differ. MPI-Adapter2 automatically builds MPI ABI translation libraries from MPI libraries, not only between different MPI implementations such as Open MPI, MPICH, and Intel MPI, but also between different versions of the same implementation. We implemented and evaluated MPI-Adapter2 across several versions of Intel MPI, MPICH, MVAPICH, and Open MPI using the NAS Parallel Benchmarks and pHEAT-3D. MPI-Adapter2 worked in all cases except one: an Open MPI ver. 4 binary running on Open MPI ver. 2 failed on the IS benchmark of the NAS Parallel Benchmarks because of a difference in MPI object size. We also evaluated a pHEAT-3D binary compiled with Open MPI ver. 5 using MPI-Adapter2 at up to 1024 processes on 128 nodes. The performance overhead between MPI-Adapter2 and the native Intel evaluation was 1.3%.
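The ABI mismatch described in this abstract can be illustrated with a small model. MPICH-derived libraries represent handles such as `MPI_Comm` as integers, while Open MPI represents them as pointers, so a binary built against one cannot pass its handles directly to the other. The sketch below shows one generic way a shim could bridge the two, a handle translation table. It is only an illustration under that assumption: the class `HandleTranslator` and its methods are hypothetical names, and nothing here is taken from MPI-Adapter2's actual implementation.

```python
import ctypes

# Illustrative stand-ins for the two handle representations:
# integer handles (MPICH-style) and pointer-sized handles (Open MPI-style).
HandleA = ctypes.c_int
HandleB = ctypes.c_void_p


class HandleTranslator:
    """Hypothetical shim state: maps caller-side integer handles to
    library-side pointer-sized handles and back."""

    def __init__(self):
        self._a_to_b = {}
        self._b_to_a = {}
        self._next_a = 1  # 0 is reserved here as a stand-in for a null handle

    def register(self, b_handle: int) -> int:
        """Return the caller-side handle for a library-side handle,
        allocating a new one the first time the handle is seen."""
        if b_handle in self._b_to_a:
            return self._b_to_a[b_handle]
        a_handle = self._next_a
        self._next_a += 1
        self._a_to_b[a_handle] = b_handle
        self._b_to_a[b_handle] = a_handle
        return a_handle

    def to_library(self, a_handle: int) -> int:
        """Translate a caller-side handle back before calling the library."""
        return self._a_to_b[a_handle]


translator = HandleTranslator()
comm = translator.register(0x7F00DEAD0000)  # made-up pointer value
print(comm, hex(translator.to_library(comm)))
print(ctypes.sizeof(HandleA), ctypes.sizeof(HandleB))  # sizes differ on 64-bit systems
```

A table like this keeps the application's handles stable while the underlying library is swapped, at the cost of one lookup per translated argument.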

Citations: 0

The Error-Energy Tradeoff in Molecular and Molecular-Continuum Fluid Simulations

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2024-01-11 DOI: 10.1145/3636480.3636486

Amartya Das Sharma, Ruben Horn, Philipp Neumann

Abstract: Energy consumption plays a crucial role when designing simulation studies. In this work, we take a step towards modelling the relationship between statistical error and energy consumption for molecular and molecular-continuum flow simulations. After revisiting statistical error analysis and run time complexities for molecular dynamics (MD) simulations, we verify the respective relationships in stand-alone short-range MD simulations. We then extend the analysis to coupled molecular-continuum simulations, including the multi-instance (i.e., MD ensemble averaging) case, and additionally analyse the impact of noise filters. Our findings suggest that Gauss filters can reduce the statistical error to a similar degree as doubling the number of MD instances would. We further use regression to derive an analytical energy consumption model that predicts the energy consumed on our HPC cluster HSUper to achieve simulation results at a prescribed statistical error (or gain in signal-to-noise ratio, respectively). All simulations were carried out using the MD software ls1 mardyn and the molecular-continuum coupling tool MaMiCo. However, the derived models are easily transferable to other pieces of software and other HPC platforms.
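The tradeoff in this abstract can be made concrete with the textbook scaling for ensemble averaging: the standard error of a mean over N independent samples with standard deviation sigma is sigma / sqrt(N). The sketch below assumes independent MD instances of equal variance and a purely linear energy cost per instance. Both are simplifying assumptions for illustration, and the function names and the `joules_per_instance` parameter are hypothetical. It does not reproduce the paper's regression model.

```python
import math


def standard_error(sigma: float, n_instances: int) -> float:
    """Standard error of the ensemble mean over n independent instances."""
    return sigma / math.sqrt(n_instances)


def instances_for_error(sigma: float, target_error: float) -> int:
    """Smallest number of instances whose ensemble mean reaches target_error."""
    return math.ceil((sigma / target_error) ** 2)


def energy_joules(n_instances: int, joules_per_instance: float) -> float:
    """Assumed linear energy model: every instance costs the same energy."""
    return n_instances * joules_per_instance


# Doubling the number of instances shrinks the error by a factor of sqrt(2).
ratio = standard_error(1.0, 8) / standard_error(1.0, 16)
print(ratio)

# Halving the target error quadruples the instance count, and under the
# linear assumption the energy as well.
n_coarse = instances_for_error(2.0, 0.5)
n_fine = instances_for_error(2.0, 0.25)
print(n_coarse, n_fine, energy_joules(n_fine, 100.0) / energy_joules(n_coarse, 100.0))
```

Under these assumptions, any technique that matches the error reduction of doubling the instance count without running the extra instances saves roughly half the energy at the same accuracy.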

Citations: 0

Design and Preliminary Evaluation of OpenACC Compiler for FPGA with OpenCL and Stream Processing DSL

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2020-01-15 DOI: 10.1145/3373271.3373274

Yutaka Watanabe, Jinpil Lee, K. Sano, T. Boku, M. Sato

Abstract: FPGAs have emerged as attractive computing devices in the post-Moore era because of their power efficiency and reconfigurability, even for future high-performance computing. We have designed an OpenACC compiler for FPGA that generates kernel code using a stream processing Domain Specific Language (DSL) called SPGen together with OpenCL. Although programming for FPGA has recently improved dramatically thanks to High-Level Synthesis (HLS) frameworks such as OpenCL and HLS C, it is still too difficult for HPC application developers, and directive-based programming models such as OpenACC should be supported even for FPGA. OpenCL can be used as a portable intermediate code for OpenACC for FPGA. However, the generation of hardware from OpenCL is not easy to understand and therefore requires expert knowledge. SPGen is a DSL framework for generating stream processing HDL modules from the description of a dataflow graph. The advantage of our approach is that code generation with SPGen enables more comprehensive low-level optimization in the OpenACC compiler. The preliminary evaluation results show that, for some kernels, the proposed method, which translates the OpenACC C code into OpenCL and SPGen codes, can perform lower-level optimization more explicitly than the OpenCL-only method, which translates the OpenACC C code into OpenCL code only. We also observed that the proposed method might consume more resources. However, the implementations of both methods are preliminary. We believe that improving code generation will fix problems such as high resource consumption.

Citations: 5

The Analysis of Inter-Process Interference on a Hybrid Memory System

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2020-01-15 DOI: 10.1145/3373271.3373272

S. Imamura, Eiji Yoshida

Citations: 9

Parallel Multigrid Method on Multicore/Manycore Clusters

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2020-01-15 DOI: 10.1145/3373271.3373273

K. Nakajima

Citations: 0

OpenCL-enabled GPU-FPGA Accelerated Computing with Inter-FPGA Communication

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 2020-01-15 DOI: 10.1145/3373271.3373275

Ryohei Kobayashi, N. Fujita, Y. Yamaguchi, Ayumi Nakamichi, T. Boku

Abstract: Field-programmable gate arrays (FPGAs) have garnered significant interest in high-performance computing research; their computational and communication capabilities have drastically improved in recent years owing to advances in semiconductor integration technologies. In addition to improving FPGA performance, toolchains for the development of FPGAs in OpenCL that reduce the amount of programming effort required have been developed and offered by FPGA vendors. These improvements reveal the possibility of implementing a concept that enables on-the-fly offloading of computational loads at which CPUs/GPUs perform poorly compared to FPGAs while moving data with low latency. We think that this concept is key to improving the performance of heterogeneous supercomputers that use accelerators such as the GPU. In this paper, we propose an approach for GPU-FPGA accelerated computing with the OpenCL programming framework that is based on the OpenCL-enabled GPU-FPGA DMA method and the FPGA-to-FPGA communication method. The experimental results demonstrate that our proposed method can enable GPUs and FPGAs to work together over different nodes.

Citations: 1

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops Pub Date : 1900-01-01 DOI: 10.1145/3373271

Citations: 1