{"title":"A Locality-aware Cooperative Distributed Memory Caching for Parallel Data Analytic Applications","authors":"Chia–Ting Hung, J. Chou, Ming-Hung Chen, I. Chung","doi":"10.1109/IPDPSW55747.2022.00183","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00183","url":null,"abstract":"Memory caching has long been used to fill the performance gap between processor and disk by reducing the data access time of data-intensive computations. Previous studies on caching mostly focus on optimizing the hit rate of a single machine. In this paper, however, we argue that the caching decisions of a distributed memory system should be made cooperatively for parallel data analytic applications, which are commonly used by emerging technologies, such as Big Data and AI (Artificial Intelligence), to perform data mining and sophisticated analytics on larger data volumes in a shorter time. A parallel data analytic job consists of multiple parallel tasks. Hence, the completion time of a job is bounded by its slowest task, meaning that the job cannot benefit from caching until the inputs of all its tasks are cached. To address this problem, we propose a cooperative caching design that periodically rearranges the cache placement among nodes according to the data access pattern while taking task dependency and network locality into account. Our approach is evaluated by a trace-driven simulator using both synthetic workloads and real-world traces. The results show that we can reduce the average completion time by up to 33% compared to non-collaborative caching policies and by 25% compared to other state-of-the-art collaborative caching policies.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121684445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CORtEX 2022 Invited Speaker 4: Large-scale simulations of mammalian brains using peta- to exa-scale computing","authors":"J. Igarashi","doi":"10.1109/IPDPSW55747.2022.00216","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00216","url":null,"abstract":"A whole-brain simulation allows us to investigate all interactions among neurons in the brain to understand the mechanisms of information processing and brain diseases. The computational performance of exascale supercomputers in the 2020s is estimated to enable whole-brain simulation at a human scale. However, simulations that sufficiently reproduce and predict the neural behavior and functionality of the whole brain have not yet been realized, due to the lack of computational resources, physiological and anatomical data, brain models, and neural network simulators. We have studied large-scale brain simulations on various supercomputers as steps toward whole-brain simulation. In this talk, we will introduce studies on developing efficient spiking neural simulators, modeling brain disease, and large-scale simulations of the cortico-cerebello-thalamic circuit using the supercomputer Fugaku.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121695712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Algorithms for the Graph Biconnectivity and Least Common Ancestor Problems","authors":"Ian Bogle, George M. Slota","doi":"10.1109/IPDPSW55747.2022.00187","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00187","url":null,"abstract":"Graph connectivity analysis is one of the primary ways to analyze the topological structure of social networks. Graph biconnectivity decompositions are of particular interest due to how they identify cut vertices and cut edges in a network. We present the first, to our knowledge, implementation of a distributed-memory parallel biconnectivity algorithm. As part of our algorithm, we also require the computation of least common ancestors (LCAs) of non-tree edge endpoints in a BFS tree. As such, we also propose a novel distributed algorithm for the LCA problem. Using our implementations, we observe up to a 14.8× speedup from 1 to 128 MPI ranks for computing a biconnectivity decomposition.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"8 Pt 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126270743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Power Consumption of Lossy Compressed I/O for Exascale HPC Systems","authors":"Grant Wilkins, Jon C. Calhoun","doi":"10.1109/IPDPSW55747.2022.00184","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00184","url":null,"abstract":"Exascale computing enables unprecedented, detailed and coupled scientific simulations which generate data on the order of tens of petabytes. Due to large data volumes, lossy compressors become indispensable as they enable better compression ratios and runtime performance than lossless compressors. Moreover, as high-performance computing (HPC) systems grow larger, they draw power on the scale of tens of megawatts. Data motion is expensive in time and energy. Therefore, optimizing compressor and data I/O power usage is an important step in reducing energy consumption to meet sustainable computing goals and stay within limited power budgets. In this paper, we explore power consumption gains for the SZ and ZFP lossy compressors and for data writing on a cloud HPC system while varying the CPU frequency, scientific data sets, and system architecture. Using this power consumption data, we construct a power model for lossy compression and present a tuning methodology that reduces the energy overhead of lossy compressors and data writing on HPC systems by 14.3% on average. Applying our model, we find 6.5 kJ, or 13%, of energy savings on average for 512 GB of I/O. Therefore, utilizing our model results in more energy-efficient lossy data compression and I/O.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126935991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Methodology to Build Decision Analysis Tools Applied to Distributed Reinforcement Learning","authors":"Cèdric Prigent, Loïc Cudennec, Alexandru Costan, Gabriel Antoniu","doi":"10.1109/IPDPSW55747.2022.00173","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00173","url":null,"abstract":"As Artificial Intelligence-based applications become more and more complex, speeding up the learning phase (which is typically computation-intensive) becomes more and more necessary. Distributed machine learning (ML) appears adequate to address this problem. Unfortunately, ML also brings new development frameworks, methodologies and high-level programming languages that do not fit the regular high-performance computing design flow. This paper introduces a methodology to build a decision-making tool that allows ML experts to arbitrate between different frameworks and deployment configurations, in order to fulfill project objectives such as the accuracy of the resulting model, the computing speed or the energy consumption of the learning computation. The proposed methodology is applied to an industrial-grade case study in which reinforcement learning is used to train an autonomous steering model for a cargo airdrop system. Results are presented within a Pareto front that lets ML experts choose an appropriate solution, a framework and a deployment configuration, based on the current operational situation. While the proposed approach can effortlessly be applied to other machine learning problems, as for many decision-making systems, the selected solutions involve a trade-off between several antagonistic evaluation criteria and require experts from different domains to pick the most efficient solution from the short list. Nevertheless, this methodology speeds up the development process by clearly discarding, or, on the contrary, including combinations of frameworks and configurations, which has a significant impact for time- and budget-constrained projects.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125910431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Customizable Lightweight STM for Irregular Algorithms on GPU","authors":"Shayan Manoochehri, Patrick Cristofaro, D. Goswami","doi":"10.1109/IPDPSW55747.2022.00098","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00098","url":null,"abstract":"Irregular algorithms are often encountered in highly data-centric application domains. These algorithms operate on irregular data structures such as sparse graphs with irregular access patterns, which may also modify the underlying topology unpredictably. High computational time and the inherent data parallelism present in these algorithms motivate the use of GPUs for speeding things up; however, their efficient implementation is challenging due to: the difficulty of protecting shared data consistency in the presence of concurrent dynamic transactions; irregular access patterns due to unstructured data structures; and dynamic structural modifications of the underlying topology. One approach to overcome these challenges is to use Software Transactional Memory (STM). However, the overly complex design and implementations of contemporary STM-based approaches and the lack of a proper framework to employ them in conjunction with irregular algorithms stall their adoption by the programming community. To overcome some of these challenges, this research proposes a lightweight STM with a simple design (Lite GSTM), based on a lock stealing algorithm, and an associated extensible framework to hide the complexity of the STM from the programmer. The framework is extensible by allowing plug-ins of customized STMs designed for the different needs of transactions. The use of the framework is elaborated with two use cases which employ completely different irregular algorithms but share some common features: the underlying data structure is a graph, and the graph is structurally modified (coarsened) unpredictably in the course of execution. The paper presents performance comparisons of the STM-based implementations with respect to their sequential and non-STM-based counterparts, which show promising results.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126558746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Parallelization of Programs via Software Stream Rewriting","authors":"Tao Tao, D. Plaisted","doi":"10.1109/IPDPSW55747.2022.00094","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00094","url":null,"abstract":"We introduce a system for automatically parallelizing programs using a parallel-by-default language based on stream rewriting. Our method is general and supports all programs that can be written in a typical high-level, imperative language. The technique is fine-grained and fully automatic. It requires no programmer annotation, static analysis, runtime profiling, or cutoff schemes. The only assumption is that all function arguments in the input program can be executed in parallel. This does not affect the generality of our system since programmers can write sequential parts in continuation-passing style. Experiments show that the runtime can scale computation-bound programs up to 16 cores without performance degradation. Future work remains to improve key aspects of the runtime and further increase the system's performance.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128130960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a GraphBLAS Implementation for Go","authors":"Pascal Costanza, I. Hur, T. Mattson","doi":"10.1109/IPDPSW55747.2022.00052","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00052","url":null,"abstract":"The GraphBLAS are building blocks for constructing graph algorithms as linear algebra. They are defined mathematically with the goal that they would eventually map onto a variety of programming languages. Today they exist in C, C++, Python, MATLAB®, and Julia. In this paper, we describe the GraphBLAS for the Go programming language. A particularly interesting aspect of this work is that using the concurrency features of the Go language, we aim to build a runtime system that uses the GraphBLAS nonblocking mode by default.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128218219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ReconOS64: A Hardware Operating System for Modern Platform FPGAs with 64-Bit Support","authors":"L. Clausing, M. Platzner","doi":"10.1109/IPDPSW55747.2022.00029","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00029","url":null,"abstract":"Reconfigurable hardware operating systems provide software-like abstractions for hardware accelerators. In particular, abstractions that view hardware accelerators as threads and integrate them into a multi-threaded environment have gained popularity. However, such abstractions are not yet available for the latest platform FPGAs. In this paper, we present ReconOS64, a reconfigurable hardware operating system for modern 64-bit platform FPGAs. We discuss the architecture and the build flow and report on a number of experiments that evaluate the performance of the system. In particular, we compare the performance to a previous, 32-bit ReconOS system. The evaluation shows that the step towards 64-bit is not only necessary to make hardware operating system support available for modern platform FPGAs, but also improves the performance of operating system calls and memory accesses for hardware threads.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127263776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Memory Contention between Communications and Computations in Distributed HPC Systems","authors":"Alexandre Denis, E. Jeannot, Philippe Swartvagher","doi":"10.1109/IPDPSW55747.2022.00086","DOIUrl":"https://doi.org/10.1109/IPDPSW55747.2022.00086","url":null,"abstract":"To amortize the cost of MPI communications, distributed parallel HPC applications can overlap network communications with computations in the hope of improving global application performance. When using this technique, computations and communications run at the same time. But computation usually also performs some data movement. Since data for computations and for communications use the same memory system, memory contention may occur when computations are memory-bound and large messages are transmitted through the network at the same time. In this paper we propose a model to predict memory bandwidth for computations and for communications when they are executed side by side, according to data locality and taking contention into account. Elaborating the model allowed us to better understand the location of bottlenecks in the memory system and the strategies the memory system employs under contention. The model was evaluated on many platforms with different characteristics, and showed an average prediction error below 4%.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"83 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133784500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}