Implementing the Broadcast Operation in a Distributed Task-based Runtime
Rodrigo Ceccato, H. Yviquel, M. Pereira, Alan Souza, G. Araújo
2022 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW). DOI: 10.1109/SBAC-PADW56527.2022.00014

Abstract: Scientific applications that require high performance rely on multi-node, multi-core systems equipped with accelerators. Code for these heterogeneous architectures often mixes different programming paradigms and is hard to read and maintain. Task-based distributed runtimes can improve portability and readability by allowing programmers to write tasks that are automatically scheduled and offloaded for execution. Nevertheless, in large systems, communication can dominate execution time. To mitigate this, such systems usually implement collective operation algorithms that efficiently execute common data-movement patterns across a group of processes. This work studies the use of different broadcast strategies in the OpenMP Cluster (OMPC) task-based runtime. In addition to OMPC's default behavior of on-demand data delivery, we introduce a routine that automatically detects data movement equivalent to a broadcast in the task graph and actively sends the data through a specialized algorithm. Our largest test, broadcasting 64 GB of data across 64 worker nodes on the Santos Dumont cluster with an extended version of Task Bench, showed a 2.02x speedup with the Dynamic Broadcast algorithm and a 2.49x speedup with the MPI broadcast routine, compared to the default on-demand delivery.
An OpenMP-only Linear Algebra Library for Distributed Architectures
Carla Cardoso, H. Yviquel, G. Valarini, Gustavo Leite, Rodrigo Ceccato, M. Pereira, Alan Souza, G. Araújo
2022 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW). DOI: 10.1109/SBAC-PADW56527.2022.00013

Abstract: This paper presents OMPC PLASMA, a dense linear algebra library for distributed-memory systems. It leverages the OpenMP Cluster (OMPC) programming model to execute the PLASMA library with task parallelism on a distributed cluster architecture. The OpenMP Cluster model is used to define task regions that are then distributed across the cluster nodes by the OMPC runtime, which automatically manages task scheduling, inter-node communication, and fault tolerance. OMPC PLASMA modifies several PLASMA functions to distribute the matrix across the nodes and perform the computation using each node's threads. Experimental results show that OMPC PLASMA achieves speedups of 4.00x with 4 worker nodes, 7.00x with 8, and 12.00x with 16 over the original single-node implementation. OMPC PLASMA also achieves a 3.00x speedup over ScaLAPACK with 4 worker nodes and a matrix size of 90k×90k.
I/O performance of multiscale finite element simulations on HPC environments
F. Boito, A. T. Gomes, Louis Peyrondet, Luan Teylo
2022 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW). DOI: 10.1109/SBAC-PADW56527.2022.00012

Abstract: In this paper, we present MSLIO, a code to mimic the I/O behavior of multiscale simulations. Such an I/O kernel is useful for HPC research, as it can be executed more easily and more efficiently than the full simulations when researchers are interested in the I/O load only. We validate MSLIO by comparing it to the I/O performance of an actual simulation, and we then use it to test some possible improvements to the output routine of the MHM (Multiscale Hybrid Mixed) library.