2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW): Latest Literature

Revisiting Credit Distribution Algorithms for Distributed Termination Detection
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00095
G. Bosilca, Aurélien Bouteiller, T. Hérault, Valentin Le Fèvre, Y. Robert, J. Dongarra
Abstract: This paper revisits distributed termination detection algorithms in the context of High-Performance Computing (HPC) applications. We introduce an efficient variant of the Credit Distribution Algorithm (CDA) and compare it to the original algorithm (HCDA) as well as to its two primary competitors: the Four Counters algorithm (4C) and the Efficient Delay-Optimal Distributed algorithm (EDOD). We analyze the behavior of each algorithm for some simplified task-based kernels and show the superiority of CDA in terms of the number of control messages.
Citations: 3
CUDAMicroBench: Microbenchmarks to Assist CUDA Performance Programming
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00068
Xinyao Yi, D. Stokes, Yonghong Yan, C. Liao
Abstract: Achieving high performance on NVIDIA GPUs using CUDA is known to be challenging. A GPU has hundreds or thousands of cores, and a program must exhibit sufficient parallelism to achieve maximum GPU utilization. A system with GPU accelerators has a heterogeneous and deep memory system that programmers must use effectively and correctly to take full advantage of the GPU’s parallelism capability. In this paper, we present CUDAMicroBench, a collection of fourteen microbenchmarks that demonstrate performance challenges in CUDA programming and techniques for optimizing CUDA programs to address these challenges. It also includes examples and techniques for using advanced CUDA features, such as data shuffling between threads and dynamic parallelism, that can help users optimize CUDA programs for performance. The microbenchmarks can be used for evaluating the performance of GPU architectures, the memory systems of the GPU itself and of the whole system architecture, and the effectiveness of compilers and performance tools for performance analysis. They can also help users understand the complexity of heterogeneous GPU-accelerator systems through examples and guide users in performance optimization. CUDAMicroBench is released as BSD-licensed open source at https://github.com/passlab/CUDAMicroBench.git.
Citations: 2
Performance Modeling and Tuning for DFT Calculations on Heterogeneous Architectures
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00108
H. Ahmed, David B. Williams-Young, K. Ibrahim, Chao Yang
Abstract: Tuning scientific code for heterogeneous computing architectures is a growing challenge. Not only do we need to tune the code to multiple architectures, but we also need to select or schedule computations to the most efficient compute variant. In this paper, we explore the tuning and performance modeling of one of the most time-consuming kernels in density functional theory calculations on systems with a multicore host CPU accelerated with GPUs. We show that the problem configuration dictates the choice of the most efficient compute engine. This choice could alternate between the host and the accelerator, especially while scaling. As such, a performance model that predicts the execution time on the host CPU and GPU is essential to select the compute environment and to achieve optimal performance. We present a simple model that empirically carries out such tasks and could accurately steer the scheduling of computation.
Citations: 1
GrAPL 2021 Keynote 1: Sparse Adjacency Matrices at the Core of Graph Databases: GraphBLAS the Engine Behind RedisGraph Property Graph Database
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/ipdpsw52791.2021.00044
Citations: 0
DataVinci: Proactive Data Placement for Ad-Hoc Computing
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00129
Martin Breitbach, Janick Edinger, Dominik Schäfer, Christian Becker
Abstract: Mobile ad-hoc computing enables applications to offload computationally intensive tasks to end-user devices in proximity. Many state-of-the-art applications such as face recognition, machine learning, or computer vision require large amounts of input data that are shared among multiple tasks. In these use cases, offloading the workload to remote devices becomes more time-consuming and, consequently, less attractive due to the required data transfer. As a solution, a proactive distribution of the data files on potential computational resource providers eliminates the need for ad-hoc data transfers. The characteristics of ad-hoc computing environments necessitate non-trivial data and task placement strategies. In this paper, we propose DataVinci, a data and task scheduler for mobile ad-hoc computing environments. DataVinci determines the number of copies for each data file (replicas), places these replicas proactively on remote devices, and schedules tasks based on the previously created data distribution. It continuously adjusts the number of replicas and balances the trade-off between execution latencies and data transfer overhead. In a large-scale study, we show the effectiveness of DataVinci, which reduces the average task execution time by more than 60 percent compared to an approach without proactive data placement, while keeping the amount of transferred data constant.
Citations: 1
ScaDL 2021 Invited Speaker-1
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/ipdpsw52791.2021.00136
Citations: 0
AsHES 2021 Keynote - Addressing Scalability Bottlenecks of DNN Training Through Hardware Heterogeneity: A View from the Perspectives of Memory Capacity and Energy Consumption
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/ipdpsw52791.2021.00073
Citations: 0
Shared-Memory Scalable k-Core Maintenance on Dynamic Graphs and Hypergraphs
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00158
Kasimir Gabert, Ali Pinar, Ümit V. Çatalyürek
Abstract: Computing k-cores on graphs is an important graph mining target as it provides an efficient means of identifying a graph’s dense and cohesive regions. Computing k-cores on hypergraphs has seen recent interest, as many datasets naturally produce hypergraphs. Maintaining k-cores as the underlying data changes is important as graphs are large, growing, and continuously modified. In many practical applications, the graph updates are bursty, both with periods of significant activity and periods of relative calm. Existing maintenance algorithms fail to handle large bursts, and prior parallel approaches on both graphs and hypergraphs fail to scale as available cores increase. We address these problems by presenting two parallel and scalable fully-dynamic batch algorithms for maintaining k-cores on both graphs and hypergraphs. Both algorithms take advantage of the connection between k-cores and h-indices. One algorithm is well suited for large batches and the other for small. We provide the first algorithms that experimentally demonstrate scalability as the number of threads increases while sustaining high change rates in graphs and hypergraphs.
Citations: 12
A New Double Rank-based Multi-workflow Scheduling with Multi-objective Optimization in Cloud Environments
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00015
Feng Li, Moon Gi Seok, Wentong Cai
Abstract: Workflow scheduling in clouds has been extensively researched. Many workflows from different users could be submitted to clouds at the same time, and cloud providers should handle them simultaneously. It is therefore necessary to consider the problem of scheduling multiple workflows. In addition, cloud computing systems can offer some special features, like Pay-Per-Use and Quality of Service (QoS) over the Internet. The scheduler has to consider the tradeoffs between different QoS parameters in order to satisfy the QoS requirements. Hence, scheduling multiple heterogeneous workflows while balancing multiple objectives is a major challenge. The majority of the existing multi-workflow scheduling algorithms are based on QoS-constrained approaches and attempt to optimize one objective while taking other QoS factors as constraints. Meanwhile, most multi-objective optimization scheduling works deal with a single workflow. In contrast, this paper focuses on QoS optimization approaches, finding trade-off schedules that execute multiple workflows on cloud computing resources while balancing multiple objectives. To this end, a new double rank-based task sequencing method is proposed and integrated with a multi-objective heuristic algorithm for multi-workflow scheduling. Different algorithms are evaluated using various well-known real-world workflows and simulated workflows. The performance evaluation results demonstrate that the proposed approach is capable of generating efficient, high-quality schedules that meet multiple objectives for multiple workflows.
Citations: 2
An Area-Efficient SPHINCS+ Post-Quantum Signature Coprocessor
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date: 2021-06-01 DOI: 10.1109/IPDPSW52791.2021.00034
Quentin Berthet, A. Upegui, L. Gantel, Alexandre Duc, Giulia Traverso
Abstract: The significant advances in quantum computing over the past decade leave no doubt that quantum computers are an actual threat to cryptography. For this reason, considerable effort has recently gone into designing so-called post-quantum cryptographic primitives. The adoption of these schemes depends on the future capability of post-quantum cryptographic schemes to offer performance and functionality similar to their classical counterparts. In particular, a milestone towards standardization is the implementation of cryptographic primitives on FPGAs in a way that leads to efficient execution. We contribute in this respect by providing an area-efficient FPGA implementation of SPHINCS+, a post-quantum signature scheme which guarantees very high security, allowing its deployment into embedded systems such as hardware security modules, IoT devices or nanosatellites.
Citations: 7