2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP): Latest Papers (Page 2)

Towards a Privacy-Aware Electric Vehicle Architecture

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00048

Christian Plappert, Jonathan Stancke, Lukas Jäger

Citations: 0

GraphDEAR: An Accelerator Architecture for Exploiting Cache Locality in Graph Analytics Applications

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00029

Siyi Hu, Masaaki Kondo, Yuan He, Ryuichi Sakamoto, Haotong Zhang, Jun Zhou, Hiroshi Nakamura

Abstract: Data structure is the key in Edge Computing where various types of data are continuously generated by ubiquitous devices. Within all common data structures, graphs are used to express relationships and dependencies among human identities, objects, and locations; and they are expected to become one of the most important data infrastructure in the near future. Furthermore, as graph processing often requires random accesses to vast memory spaces, conventional memory hierarchies with caches cannot perform efficiently. To alleviate such memory access bottlenecks in graph processing, we present a solution through vertex accesses scheduling and edge array re-ordering, in parallel with the execution of graph processing application to improve both temporal and spatial locality of memory accesses, especially for edge-centric graphs which are popular means in handling dynamic graphs. Our proposed architecture is evaluated and tested through both trace-based cache simulations and cycle-accurate FPGA-based prototyping. Evaluation results show that our proposal has a potential of significantly reducing the quantity of Miss-Per-Kilo-Instructions (MPKI) for Last Level Cache (LLC) by 56.27% on average.

Citations: 1

Towards Portable Realizations of Winograd-based Convolution with Vector Intrinsics and OpenMP

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00015

M. F. Dolz, Adrián Castelló, E. S. Quintana‐Ortí

Citations: 2

A Proposal of Mobility Support for the SimGrid Toolkit: Application to IoT simulations

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00035

Elías Del-Pozo-Puñal, Félix García Carballeira

Citations: 0

Anatomy of the BLIS Family of Algorithms for Matrix Multiplication

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00023

Adrián Castelló, E. S. Quintana‐Ortí, Francisco D. Igual

Abstract: The efforts of the scientific community and hardware vendors to develop and optimize linear algebra codes have historically led to highly-tuned libraries, carefully adapted to the underlying processor architecture, with excellent (near-peak) performance. These optimization efforts, however, are commonly focused on obtaining the best performance possible when the involved operands are large and “squarish” matrices. New computationally-intensive applications (e.g., in deep learning) are increasingly demanding high-performance BLAS (Basic Linear Algebra Subprograms) also for small operands in any of their dimensions. In this paper, we tackle this problem by refactoring the general matrix-matrix multiplication (GEMM) algorithm within a specific high-performance implementation of BLAS, named BLIS, proposing a complete family of algorithmic variants to implement GEMM with different strategies to exploit the target cache hierarchy, together with the changes to be applied to architecture-specific codes to instantiate a complete GEMM implementation. Experimental results on an ARM processor (NVIDIA Carmel) reveal significant performance differences between the members of the GEMM family, depending on the shape and dimension of the matrix operands.
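Illustrative sketch (not from the paper): the idea of structuring GEMM around the cache hierarchy can be shown with a plain-Python blocked multiplication. The function name `gemm_blocked`, the block sizes `mc`, `nc`, `kc`, and the loop order used here are assumptions chosen for illustration; the paper's algorithmic variants and its architecture-specific micro-kernels are not reproduced.

```python
def gemm_blocked(A, B, C, mc=2, nc=2, kc=2):
    """Compute C += A @ B on lists of lists, one block at a time.

    mc, nc, kc are hypothetical block sizes; a tuned library would pick
    them to match the cache levels of the target processor.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, nc):          # column panel of B and C
        for pc in range(0, k, kc):      # slice of the shared k dimension
            for ic in range(0, m, mc):  # row panel of A and C
                # Stand-in for a micro-kernel: update one block of C.
                for i in range(ic, min(ic + mc, m)):
                    for j in range(jc, min(jc + nc, n)):
                        acc = C[i][j]
                        for p in range(pc, min(pc + kc, k)):
                            acc += A[i][p] * B[p][j]
                        C[i][j] = acc
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
C = gemm_blocked(A, B, [[0, 0], [0, 0]])
print(C)
```

Reordering the three outer loops changes which operand block is reused across iterations; the result is the same for any order and any block sizes, which is what makes a family of variants possible.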

Citations: 3

Decision Tree-Based Rule Derivation for Intrusion Detection in Safety-Critical Automotive Systems

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00046

Lucas Buschlinger, Sanat Sarda, C. Krauß

Citations: 2

A Parallel Approximation Algorithm for the Steiner Forest Problem

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00016

Laleh Ghalami, Daniel Grosu

Abstract: In the Steiner Forest problem, we are given an undirected graph with non-negative weights for edges, a set of pairs of vertices, called terminals, and the goal is to find the minimum cost subgraph that connects each of the terminal pairs together. There exist several sequential heuristic and approximation algorithms for the Steiner Forest problem. In practice, the primal-dual 2-approximation algorithm is one of the fastest and obtains solutions that are very close to the optimal solution. In this paper, we design a practical parallel approximation algorithm based on the primal-dual sequential algorithm. The parallel algorithm maintains the approximation guarantees of the sequential primal-dual algorithm and it is specifically designed for execution on multi-core computers. We implement and run the parallel algorithm on a multi-core system with a large number of cores and perform an extensive experimental performance analysis on randomly generated graphs. The results show that our proposed parallel approximation algorithm achieves a significant speedup with respect to the sequential primal-dual algorithm.
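Illustrative sketch (not from the paper): the problem statement can be made concrete with a small checker that tests whether a chosen set of edges connects every terminal pair and reports its total cost. This is not the primal-dual algorithm or its parallel version; the function name `connects_all_pairs` and the toy graph are hypothetical.

```python
def connects_all_pairs(num_vertices, chosen_edges, terminal_pairs):
    """Return (feasible, cost) for a candidate Steiner Forest solution.

    chosen_edges is a list of (u, v, weight) tuples; terminal_pairs is a
    list of (s, t) vertex pairs that must end up in the same component.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    cost = 0
    for u, v, w in chosen_edges:
        parent[find(u)] = find(v)  # merge the two components
        cost += w

    feasible = all(find(s) == find(t) for s, t in terminal_pairs)
    return feasible, cost


# Toy instance: two separate components, each connecting one terminal pair.
edges = [(0, 1, 2), (2, 3, 1)]
pairs = [(0, 1), (2, 3)]
result = connects_all_pairs(4, edges, pairs)
print(result)
```

A feasible solution may be a forest with several components, as in the toy instance; only the vertices of each terminal pair need to share a component.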

Citations: 0

Exploiting Vector Extensions to Accelerate Time Series Analysis

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00017

Ricardo Quislant, I. Fernandez, E. Serralvo, E. Gutiérrez, O. Plata

Citations: 0

NoaSci: A Numerical Object Array Library for I/O of Scientific Applications on Object Storage

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00034

Steven W. D. Chien, Artur Podobas, Martin Svedin, A. Tkachuk, Salem El Sayed, Pawel Herman, G. Umanesan, Sai B. Narasimhamurthy, S. Markidis

Abstract: The strong consistency and stateful workflow are seen as the major factors for limiting parallel I/O performance because of the need for locking and state management. While the POSIX-based I/O model dominates modern HPC storage infrastructure, emerging object storage technology can potentially improve I/O performance by eliminating these bottlenecks. Despite a wide deployment on the cloud, its adoption in HPC remains low. We argue one reason is the lack of a suitable programming interface for parallel I/O in scientific applications. In this work, we introduce NoaSci, a Numerical Object Array library for scientific applications. NoaSci supports different data formats (e.g. HDF5, binary), and focuses on supporting node-local burst buffers and object stores. We demonstrate for the first time how scientific applications can perform parallel I/O on Seagate’s Motr object store through NoaSci. We evaluate NoaSci’s preliminary performance using the iPIC3D space weather application and position against existing I/O methods.

Citations: 0

Clustering Datasets in Cloud Computing Environment for User Identification

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) Pub Date : 2022-03-01 DOI: 10.1109/pdp55904.2022.00033

Shallaw Mohammed Ali, G. Kecskeméti

Citations: 1