"Instrumental Data Management and Scientific Workflow Execution: the CEA Case Study"
F. Boito, J. Méhaut, T. Deutsch, B. Videau, F. Desprez
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2019. DOI: 10.1109/IPDPSW.2019.00139

Abstract: In this paper, we study a typical scenario in research facilities: instrumental data is generated by lab equipment such as microscopes, collected by researchers onto USB devices, and analyzed on their own computers. In this scenario, an instrumental data management framework could store the data in an institution-level storage infrastructure and allow analysis tasks to be executed on available processing nodes. This setup has the advantages of promoting reproducible research and efficient usage of the expensive lab equipment, in addition to increasing researchers' productivity. We detail the requirements for such a framework with respect to the needs of our CEA case study, review existing solutions, and recommend Galaxy. We then analyze the performance limitations of the proposed architecture and identify the connection between the centralized storage and the processing nodes as the critical point. We also conduct a performance evaluation on an experimental platform to observe the limitations encountered in practice. We finish by pointing out issues that are not addressed by existing solutions and are therefore future work perspectives for the research field.
{"title":"Evaluation of Circuits on the Reconfigurable Mesh","authors":"Y. Ben-Asher, Esti Stein","doi":"10.1109/IPDPSW.2019.00020","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00020","url":null,"abstract":"The Reconfigurable Mesh (RM) is a grid of Processing Elements (PEs) that use dynamic reconfigurations to create varying bus-segments between its PEs. This allows the RM to perform computations such as sorting or counting in a constant number of steps. It has long been speculated that the RM's dynamic reconfiguration should replace the static reconfiguration architecture of the FPGA. In this work, we show that the RM can be used not only to accelerate specific computations such as sorting or summing but also for speeding up the main function of the FPGA, namely evaluation of Boolean Circuits (BCs). We propose an RM algorithm to evaluate BCs and show that it can be done without size blow-up. Moreover, like in the FPGA, it can be done using a grid of tri-state switching elements, rather than a grid of PEs as is the case with the regular RM. This model is called FPRM, and preliminary ASIC synthesis results illustrate that the FPRM architecture is about 2X faster and also more efficient in power/area than the FPGA routing infrastructure.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123431871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FIFO-Based Hardware Sorters for High Bandwidth Memory","authors":"K. Nakano, Yasuaki Ito, J. Bordim","doi":"10.1109/IPDPSW.2019.00112","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00112","url":null,"abstract":"The main contribution of this paper is to show efficient FIFO-based hardware sorters that sort n elements with w bits each stored in a high bandwidth memory with modest access latency. We assume that each address of the high bandwidth memory can store p elements of w bits each, which can be read or written at the same time. The access latency l of the high bandwidth memory is assumed to take l clock cycles to access p elements in a specified address. Furthermore, burst mode is supported and k (≥ 1) consecutive addresses can be accessed in k+l-1 clock cycles in a pipeline fashion. However, if k addresses are not consecutive, kl clock cycles are necessary to access all of them. Clearly, all n elements arranged n/p addresses can be duplicated in 2(n/p+l-1) clock cycles. We present two types of hardware sorters that sort n=rc elements stored in an r×c matrix of the high bandwidth memory. We first develop Three-Pass-Sort and Four-Pass-Sort that sort an r×c matrix by reading from and witting in it three times and four times, respectively. We implement these two algorithms using FIFO-based mergers that can be configured as pairwise mode and sliding mode. Our hardware sorter based on Three-Pass-Sort runs in 6n/p+3c^2/p^2l+O(c/p(l+log r)+r) clock cycles using a circuit of size O(rwp) provided that r≥c^2. Also, our hardware sorter based on Four-Pass-Sort runs in 8n/p+2c^2l+O(cl+log r+p) clock cycles using a circuit of size O(rw).","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127152603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"A Reinforcement Learning Scheduling Strategy for Parallel Cloud-Based Workflows"
André Nascimento, Victor Olimpio, V. Silva, A. Paes, Daniel de Oliveira
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2019. DOI: 10.1109/IPDPSW.2019.00134

Abstract: Scientific experiments can be modeled as workflows. Such workflows are usually compute- and data-intensive, demanding the use of High-Performance Computing environments such as clusters, grids, and clouds. The latter offers the advantage of elasticity, which allows increasing and/or decreasing the number of Virtual Machines (VMs) on demand. Workflows are typically managed using Scientific Workflow Management Systems (SWfMSs), many of which support cloud-based execution. Each SWfMS has its own scheduler that follows a well-defined cost function. However, such cost functions must consider the characteristics of a dynamic environment, such as live migrations and/or performance fluctuations, which are far from trivial to model. This paper proposes a novel scheduling strategy, named ReASSIgN, based on Reinforcement Learning (RL). By relying on an RL technique, one may assume that there is an optimal (or sub-optimal) solution to the scheduling problem and aim at learning the best scheduling from previous executions, in the absence of a mathematical model of the environment. To this end, we propose an extension of the well-known workflow simulator WorkflowSim that implements an RL strategy for scheduling workflows. Once the scheduling plan is generated, the workflow is executed in the cloud using the SciCumulus SWfMS. We conducted a thorough evaluation of the proposed scheduling strategy using a real astronomy workflow.
{"title":"Approximate and Exact Selection on GPUs","authors":"T. Ribizel, H. Anzt","doi":"10.1109/IPDPSW.2019.00088","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00088","url":null,"abstract":"We present a novel algorithm for parallel selection on GPUs. The algorithm requires no assumptions on the input data distribution, and has a much lower recursion depth compared to many state-of-the-art algorithms. We implement the algorithm for different GPU generations, always using the respectively-available low-level communication features, and assess the performance on server-line hardware. The computational complexity of our SampleSelect algorithm is comparable to specialized algorithms designed for - and exploiting the characteristics of - \"pleasant\" data distributions. At the same time, as the SampleSelect does not work on the actual values but the ranks of the elements only, it is robust to the input data and can complete significantly faster for adversarial data distributions. Additionally to the exact SampleSelect, we address the use case of approximate selection by designing a variant that radically reduces the computational cost while preserving high approximation accuracy.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114168780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Teaching High Performance Computing through Parallel Programming Marathons"
L. A. J. Marzulo, Calebe P. Bianchini, Leandro Santiago, V. C. Ferreira, Brunno F. Goldstein, F. França
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2019. DOI: 10.1109/IPDPSW.2019.00058

Abstract: Parallel and distributed programming is essential for exploiting the processing power of modern computing platforms. However, during the first years of a Computer Science course, students usually learn problem-solving techniques, data structures, and programming paradigms that are inherently sequential, hindering the transition to parallel architectures. The Parallel Programming Marathons organized in Brazil are similar to other programming competitions around the world and have been used to teach and stimulate undergraduate and graduate students to learn to "think in parallel" and to develop applications for different parallel architectures, including multicores, clusters, and accelerators. This paper presents the structure of the Parallel Programming Marathon and an overview of how it supports regional and national contests. This work also presents use cases from Parallel and Distributed Computing courses at two Brazilian universities that follow a challenge-based learning approach and employ marathon problems as course assignments. This approach contributed to increasing students' interest in High Performance Computing.
"BRICS – Efficient Techniques for Estimating the Farness-Centrality in Parallel"
Sai Charan Regunta, Sai Harsh Tondomker, Kishore Kothapalli
2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2019. DOI: 10.1109/IPDPSW.2019.00110

Abstract: In this paper, we study scalable parallel algorithms for estimating the farness-centrality value of the nodes in a given undirected and connected graph. Our algorithms consider approaches that are more suitable for sparse graphs. To this end, we propose four optimization techniques based on removing redundant nodes, removing identical nodes, removing chain nodes, and making use of a decomposition based on the biconnected components of the input graph. We test our techniques on a collection of real-world graphs, measuring the time taken and the average error percentage. We further analyze the applicability of our techniques to various classes of real-world graphs and suggest why certain techniques work better on certain classes of graphs.
{"title":"Are we Doing the Right Thing? — A Critical Analysis of the Academic HPC Community","authors":"H. Anzt, Goran Flegar","doi":"10.1109/IPDPSW.2019.00122","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00122","url":null,"abstract":"Like in any other research field, academically surviving in the High Performance Computing (HPC) community generally requires to publish papers, in the bast case many of them and in high-ranked journals or at top-tier conferences. As a result, the number of scientific papers published each year in this relatively small community easily outnumbers what a single researcher can read. At the same time, many of the proposed and analyzed strategies, algorithms, and hardware-optimized implementations never make it beyond the prototype stage, as they are abandoned once they served the single purpose of yielding (another) publication. In a time and field where high-quality manpower is a scarce resource, this is extremely inefficient. In this position paper we promote a radical paradigm shift towards accepting high-quality software patches to community software packages as legitimate conference contributions. In consequence, the reputation and appointability of researchers is no longer based on the classical scientific metrics, but on the quality and documentation of open source software contributions — effectively improving and accelerating the collaborative development of community software.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130517162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heterogeneous Active Messages for Offloading on the NEC SX-Aurora TSUBASA","authors":"M. Noack, E. Focht, T. Steinke","doi":"10.1109/IPDPSW.2019.00014","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00014","url":null,"abstract":"The NEC SX-Aurora TSUBASA is a new generation of vector processing architectures that combines a standard Intel Xeon host with the newly developed NEC Vector Engine coprocessor cards. One way to use these coprocessors is offloading suitable parts of the program from the host to the Vector Engines. Currently, the only vendor-provided offloading solutions are the low-level Vector Engine Offloading (VEO) library, and a builtin reverse-offloading mechanism named VHcall. In this work, we extend the portable Heterogeneous Active Messages (HAM) based HAM-Offload framework with support for the NEC SX-Aurora TSUBASA. Therefore, we design, implement, and evaluate two messaging protocols aimed at minimising offloading cost. This sheds some light on how to achieve fast communication between host CPU and the Vector Engines of the NEC SX-Aurora TSUBASA. Compared with VEO, the DMA-based protocol reduces offloading overhead by a factor of 13×. The resulting framework enables users to write portable offload applications with low overhead, that do neither require a language extension like OpenMP, nor a special language like OpenCL. Existing HAM-Offload applications are now ready to run on the NEC SX-Aurora TSUBASA.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130762110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fast Local Algorithm for Track Reconstruction on Parallel Architectures","authors":"D. C. Pérez, N. Neufeld, A. Riscos-Núñez","doi":"10.1109/IPDPSW.2019.00118","DOIUrl":"https://doi.org/10.1109/IPDPSW.2019.00118","url":null,"abstract":"The reconstruction of particle trajectories, tracking, is a central process in the reconstruction of particle collisions in High Energy Physics detectors. At the LHCb detector in the Large Hadron Collider, bunches of particles collide 30 million times per second. These collisions produce about 10^9 particle trajectories per second that need to be reconstructed in real time, in order to filter and store data. Upcoming improvements in the LHCb detector will deprecate the hardware filter in favour of a full software filter, posing a computing challenge that requires a renovation of current algorithms and the underlying hardware. We present Search by triplet, a local tracking algorithm optimized for parallel architectures. We design our algorithm reducing Read-After-Write dependencies as well as conditional branches, incrementing the potential for parallelization. We analyze the complexity of our algorithm and validate our results. We show the scaling of our algorithm for an increasing number of collision events. We show sustained tests for our algorithm sequence given a simulated dataflow. We develop CPU and GPU implementations of our work, and hide the transmission times between device and host by executing a multi-stream pipeline. Our results provide a reliable basis for an informed assessment on the feasibility of LHCb event reconstruction on parallel architectures, enabling us to develop cost models for upcoming technology upgrades. The created software infrastructure is extensible and permits the addition of subsequent reconstruction algorithms.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128371860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}