{"title":"I/O Performance Evaluation of Large-Scale Deep Learning on an HPC System","authors":"Minho Bae, Minjoong Jeong, Sangho Yeo, Sangyoon Oh, Oh-Kyoung Kwon","doi":"10.1109/HPCS48598.2019.9188225","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188225","url":null,"abstract":"Recently, deep learning has become important in diverse fields. Because the process requires a huge amount of computing resources, many researchers have proposed methods to utilize large-scale clusters to reduce the training time. Despite many proposals concerning the training process for large-scale clusters, there remain areas to be developed. In this study, we benchmark the performance of Intel-Caffe, which is a generalpurpose distributed deep learning framework on the Nurion supercomputer of the Korea Institute of Science and Technology Information. We particularly focus on identifying the file I/O factors that affect the performance of Intel-Caffe, as well as a performance evaluation in a container-based environment. Finally, to the best of our knowledge, we present the first benchmark results for distributed deep learning in the container-based environment for a large-scale cluster.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125686085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simplifying the multi-GPU programming of a hyperspectral image registration algorithm","authors":"Jorge Fernández-Fabeiro, Arturo González-Escribano, D. Ferraris","doi":"10.1109/HPCS48598.2019.9188064","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188064","url":null,"abstract":"Hyperspectral image registration is a relevant task for real-time applications like environmental disasters management or search and rescue scenarios. Traditional algorithms for this problem were not really devoted to real-time performance. The HYFMGPU algorithm arose as a high-performance GPU-based solution to solve such a lack. Nevertheless, a single-GPU solution is not enough, as sensors are evolving and then generating images with finer resolutions and wider wavelength ranges. An MPI+CUDA multi-GPU implementation of HYFMGPU was previously presented. However, this solution shows the programming complexity of combining MPI with an accelerator programming model. In this paper we present a new and more abstract programming approach for this type of applications, which provides a high efficiency while simplifying the programming of the multi-device parts of the code. The solution uses Hitmap, a library to ease the programming of parallel applications based on distributed arrays. It uses a more algorithm-oriented approach than MPI, including abstractions for the automatic partition and mapping of arrays at runtime with arbitrary granularity, as well as techniques to build flexible communication patterns that transparently adapt to the data partitions. We show how these abstractions apply to this application class. We present a comparison of development effort metrics between the original MPI implementation and the one based on Hitmap, with reductions of up to 95% for the Halstead score in specific work redistribution steps. We finally present experimental results showing that these abstractions are internally implemented in a high efficient way that can reduce the overall performance time in up to 37% comparing with the original MPI implementation.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126219491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel construction of the Symbolic Observation Graph","authors":"Hiba Ouni, Kais Klai, Belhassen Zouari","doi":"10.1109/HPCS48598.2019.9188053","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188053","url":null,"abstract":"Extended AbstractAn efficient way to cope with the combinatorial explosion problem induced by the model checking process is to compute the Symbolic Observation Graph (SOG) which is a condensed representation of the state space graph based on a symbolic encoding of the nodes. Another way is to parallelize the construction/traversal of the state space on multiple processors. In this paper, we combine the two mentioned approaches by proposing three different approaches to parallelize the construction of the SOG. A multi-threaded approach based on a dynamic load balancing and a shared memory architecture, a distributed approach based on a distributed memory architecture and a hybrid approach that combines the two previous approaches.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128100879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inward Fractal Dual Band High Gain Compact Antenna","authors":"M. Madi, Maria Moussa, K. Kabalan","doi":"10.1109/HPCS48598.2019.9188220","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188220","url":null,"abstract":"This paper presents a microstrip antenna with nested two folds fractal shape. The fractal geometry of the patch has a slotted square boundary and spheres connected to its sides. The antenna was first optimized over a 5 x 5 cm2 area, so that dual frequency operation is obtained at the IMT band and in the Radiolocation Service at 1.4 GHz and 3.2 GHz respectively. Applications include breast cancer detection in addition to a wide range of wireless devices due to the radiation pattern and directivity characteristics. The design size is further minimized targeting the ISM frequency of 2.45 GHz. A 4 x 5 cm2 and 4 x 4 cm2 design were successfully simulated with high 10 dB gain. Results of measurements coincide very well with simulations results.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126061965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assembly micro-benchmark generator for characterizing Floating Point Units","authors":"Jean Pourroy, P. Demichel, C. Denis","doi":"10.1109/HPCS48598.2019.9188209","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188209","url":null,"abstract":"Making the right platform choice has always been a challenge for the HPC users no matter the applications vertical they are in. The number of references is very large and making the wrong choice can have adverse effects. Formerly users only had to choose between, for example, the different processors and interconnect vendors. Lately, due to the new Intel Skylake processors the choice has become increasingly difficult as different levels of performance are available within the same vendor platforms. To facilitate selection and give possible directions for the real benchmarked applications we introduce the Kernel Generator, an open source tool generating assembly kernels to help the programmer or the benchmarker understand the behavior of the different micro-architectures. We used our tool to study the behavior of the current micro-architectures and compare it to the current synthetic benchmarks which sometimes are not correctly characterizing a platform nor expose its strengths. The Kernel Generator facilitates the discovery of the platforms performance fit. To insure the relevance of our kernel, we are looking at Ansys Fluent behavior to explain the performance on the different Intel processors. In this case, we have that 4100 and 6100 Intel processors families can have equivalent performance on codes not well vectorized: Fluent being one of them. This demonstrates that we can use our tool for initial profiling and understanding of the different platforms.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"141 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131578678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of a Self-Similar GPU Thread Map for Data-parallel m-Simplex Domains","authors":"C. Navarro, B. Bustos, N. Hitschfeld-Kahler","doi":"10.1109/HPCS48598.2019.9188081","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188081","url":null,"abstract":"This work analyzes the possible performance benefits one could obtain by employing a Self-Similar type of GPU thread map on data-parallel m-simplex domains, which is the geometrical representation of several interaction problems. The main contributions of this work are (1) the proposal of a new block-space map H: $mathbb{Z}^{m}mapsto mathbb{Z}^{m}$ based on a self-similar set of sub-orthotopes, and (2) its analysis in terms of performance and thread space, from which we obtain that $mathcal{H}(omega)$ is time and space efficient for 2-simplices and only time efficient for 3-simplices unless the theoretical model is relaxed to allow concurrent parallel spaces. Experimental tests on a 2-simplex domain support the theoretical results, giving up to 30% of speedup over the standard approach. We also show how the map can utilize GPU tensor cores and further accelerate through fast matrix-multiply-accumulate operations. Finally, we show that extending the map to general m-simplices is a non-trivial optimization problem and depends of the choice of two parameters $r, beta$, for which we provide some insights in order to obtain a $mathcal{H}(omega)$ map that can be $m!$ times more space efficient than a bounding-box approach.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"249 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124737608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Performance Computing for Formal Security Assessment","authors":"L. Spalazzi, Francesco Spegni","doi":"10.1109/HPCS48598.2019.9188122","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188122","url":null,"abstract":"Assessing the degree of security of a given system w.r.t. some attacker model and security policy can be done by means of formal methods. For instance, the system can be described as a Markov Decision Process, the security policy by means of a modal logic formula, PCTL⋆, and then a probabilistic model checker can return the probability with which the policy holds in the system. This methodology suffices when all the system parameters and their values are known a priori. On the other side, in case the degree of security of the system depends on the values of the system parameters, the formally security assessment task must output a probability function which takes the system parameters and returns the probability of a successful attack to the security of the system. One simple way to describe such function involves solving many instances of the probabilistic model checking problem, one for each combination of the parameter values. In this scenario, probabilistic model checking, which suffers from the state explosion problem, may become an unfeasible task for traditional workstations or even servers.In this work we introduce the tool SecMC which drives the user in the task of modeling the system under analysis and the required security policies, together with the parameters that affect them. Next, the user can specify the range of values assumed by the parameters, and the tool can take care of iterating the probabilistic model checking task, distributing the computations among different local or remote nodes of a cluster, and collect the results to produce a combined picture of how the level of security varies w.r.t. the parameter values.In this paper we show how the tool can be used in order to formally assess security of probabilistic systems known from the literature, viz. a probabilistic cryptographic protocol, a synchronization algorithm for wireless devices inspired by fireflies in nature, and the privacy of dispersed cloud storages.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133047512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Autonomous Resource Selection Algorithm for Cooperative Awareness in Vehicular Communication","authors":"Brahmjit Singh, Sandeepika Sharma","doi":"10.1109/HPCS48598.2019.9188190","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188190","url":null,"abstract":"With rapid development in wireless communication, Intelligent Transportations System (ITS) has received significant attention. This system delivers various social and economic benefits including efficient traffic management, lesser prone to accidents, reduced air pollution and enabling of unmanned driving for enhanced leisure. ITS is enabled through real-time communication either Vehicle-to-Vehicle (V2V) or Vehicle-to-Infrastructure (V2I) modes of intra-infrastructure communication. LTE-V2V is seen as a major technology towards ITS implementation. In this paper, we discuss various autonomous resource selection techniques available for Mode-4 V2V communication under LTE-A network. An enhanced resource selection scheme based on exponential averaging in time domain and normalized scaling in frequency domain is being proposed. Simulation results show the efficacy of the proposed algorithm in terms of packet reception ratio (PRR), error rate and update delay. The proposed scheme enables low latency transmission of cooperative awareness messages while maintaining high PRR and low error rate.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115049357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the data behavior of parallel application for extracting performance knowledge.","authors":"F. Tirado, Alvaro Wong, Dolores Rexachs, E. Luque","doi":"10.1109/HPCS48598.2019.9188166","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188166","url":null,"abstract":"When performance tools are used to analyze an application with thousands of processes, the data generated can be bigger than the memory size of the cluster node, causing this data to be loaded in swap memory. In HPC systems, moving data to swap is not always an option. This problem causes scalability limitations that affect the user experience and it presents serious restrictions for executing on a large scale. In order to obtain knowledge about the application’s performance, the performance tools usually instrument the application to generate the data. When the instrumented parallel application is executed with thousands of processes, the data generated may be higher than the memory size of the compute node used to analyze the data in order to obtain the knowledge. Performance tools such as PAS2P predict the execution time in target machines. In order to predict the performance, PAS2P carries out a data analysis with the data in each application process. The data collected is analyzed sequentially, which results in an inefficient use of system resources. To solve this, we propose designing a parallel method to solve the problem when we manage a high volume of data, decreasing its execution time and increasing scalability, improving the PAS2P toolkit to generate performance knowledge defined by the application’s behavior phases.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134281996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Scalable and QoS-Aware Load Balancing Platform for Edge Computing Environments","authors":"Charafeddine Mechalikh, Hajer Taktak, Faouzi Moussa","doi":"10.1109/HPCS48598.2019.9188159","DOIUrl":"https://doi.org/10.1109/HPCS48598.2019.9188159","url":null,"abstract":"Edge computing is a new computing paradigm that brings the cloud applications close to the Internet of Things (IoT) devices at the edge of the network. It improves the resources utilization efficiency by using the resources already available at the edge of the network [8]. As a result, it decreases the cloud workload, reduces the latency, and enables a new breed of latency-sensitive applications such as the connected vehicles. Horizontal scalability is another advantage of edge computing. Unlike the cloud and fog computing, the latter takes advantages of the growing number of connected devices, as this growth results in increasing the number of the available resources. Most researches in this field were only interested in finding the optimal tasks offloading destination by minimizing the latency, the resources utilization, and the energy consumption. Therefore, they ignore the effect of the synchronization between the devices, and the applications (i.e. containers) deployment delay. Motivated by the advantages of edge computing, in this paper, we introduce a load balancing platform for IoT-edge computing environments. As opposed to the current trend, we will first focus on the applications deployment and the synchronization between devices in order to provide better scalability, enable a self-manageable IoT network, and meet the quality of service (QoS). According to the simulation results, the proposed approach provides better scalability; it reduces the network utilization and the cloud workload. In addition, it provides better applications deployment delays and a lower latency.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133464585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}