2019 International Conference on High Performance Computing & Simulation (HPCS): Latest Publications

Configuring Graph Traversal Applications for GPUs: Analysis of Implementation Strategies and their Correlation with Graph Characteristics
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188204
F. Busato, N. Bombieri
Abstract: Implementing a graph traversal (GT) algorithm for GPUs is a very challenging task. GT is a core primitive for many graph analysis applications, and its efficiency strongly impacts overall application performance. Different strategies have been proposed to implement the GT algorithm by exploiting GPU characteristics; nevertheless, the efficiency of each strongly depends on the characteristics of the graph. This paper presents an analysis of the most important features of the parallel GT algorithm, including frontier queue management, load balancing, duplicate removal, and synchronization across traversal iterations. It presents different techniques for implementing each of these features on GPUs and compares their performance on a very large and heterogeneous set of graphs. The results identify, for each feature, the implementation technique best suited to the graph characteristics. The paper finally shows how this configuration analysis allows traversing graphs with throughput up to 14,000 MTEPS on single-GPU devices, with speedups ranging from 1.2x to 18.5x over the best state-of-the-art parallel GT applications for GPUs.
Citations: 0
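The level-synchronous, frontier-based pattern analyzed above can be sketched on the CPU. The sketch below is not the authors' GPU implementation; it is a minimal BFS in which a Python set stands in for the duplicate-removal techniques the paper compares, and the per-level loop marks where GPU load balancing and synchronization would apply:

```python
def frontier_bfs(adj, source):
    """Level-synchronous BFS: each iteration expands the current frontier
    into the next one; duplicates are removed before the next level starts."""
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        next_frontier = set()          # duplicate removal: each vertex enters once
        for u in frontier:             # on a GPU, frontier work is load-balanced
            for v in adj[u]:
                if v not in depth:     # unvisited neighbour joins the next frontier
                    next_frontier.add(v)
        level += 1                     # implicit barrier between traversal iterations
        for v in next_frontier:
            depth[v] = level
        frontier = list(next_frontier)
    return depth
```

Running it on a small diamond graph assigns each vertex its traversal depth.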
PCS: A Productive Computational Science Platform
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188108
David Ojika, A. Gordon-Ross, H. Lam, Shinjae Yoo, Younggang Cui, Zhihua Dong, K. V. Dam, Seyong Lee, T. Kurth
Abstract: As modern supercomputers become increasingly heterogeneous, with diverse computational accelerators (graphics processing units (GPUs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc.), software becomes a critical design aspect. Exploiting this new computational power requires increased software design time and effort to make valuable scientific discoveries in the face of the complicated programming environments introduced by these accelerators. To address these challenges, we propose unifying multiple programming models into a single programming environment to facilitate large-scale, accelerator-aware, heterogeneous computing for next-generation scientific applications. This paper presents PCS, a productive computational science platform for cluster-scale heterogeneous computing. Focusing on FPGAs, we describe the key concepts of the PCS platform, differentiate PCS from the current state of the art, and propose a new multi-FPGA architecture for graph-centric workloads (e.g., deep learning), with a discussion of ongoing work.
Citations: 1
Enhancing Reliability of Compute Environments on Amazon EC2 Spot Instances
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188116
Altino M. Sampaio, Jorge G. Barbosa
Abstract: Amazon Elastic Compute Cloud (EC2) gives access to resources in the form of virtual servers, also known as instances. EC2 Spot Instances (SIs) offer spare compute capacity at steep discounts compared to reliable, fixed-price on-demand instances. The drawback, however, is that the waiting time until requested spots are fulfilled can be extremely high. In this paper, we propose a container-migration-based solution to enhance the reliability of virtual cluster computing environments built on top of non-reserved EC2 pricing-model instances. We compare the performance of our algorithm by executing different resource provisioning plans for running real-life workflow applications, constrained by user-defined deadline and budget Quality of Service (QoS) parameters. The results show that our solution successfully concludes almost 98% of workflow applications and more than 99% of workflow tasks for on-demand- and spot-block-based virtual compute environments. For SI-based virtual compute environments, our solution achieves similar results, completing more than 98% of workflow applications and over 99% of workflow tasks in a worst-case scenario.
Citations: 1
Multi-Parameter Performance Modeling using Symbolic Regression
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188202
Sai P. Chenna, G. Stitt, H. Lam
Abstract: Performance modeling is becoming critically important due to the need for design-space exploration on emerging exascale architectures. Existing modeling and prediction approaches are either restricted to a limited number of parameters, or offer extreme tradeoffs between simulation performance and modeling accuracy that are not ideal for exascale simulations. At one extreme are low-level discrete-event simulators, which provide high accuracy but are prohibitively slow for large-scale simulations. At the opposite extreme are abstract modeling approaches that are sufficiently fast but tend to support a limited number of parameters, while also lacking accuracy due to machine-specific behaviors that deviate from the anticipated models. In this paper, we improve upon existing abstract modeling approaches by leveraging symbolic regression to automatically discover an underlying multi-parameter model of the system and application that captures difficult-to-understand behaviors. For three High Performance Computing (HPC) applications running on Vulcan, we show that symbolic regression provided modeling accuracies that were 3.5x, 4.6x, and 6.2x better than analytical models developed using linear regression.
Citations: 5
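As an illustration of the general idea (not the authors' tool), a symbolic-regression search can be approximated by fitting small combinations of candidate basis terms and keeping the best-scoring model. The hidden runtime model, the term library, and the exhaustive two-term search below are all invented for the example; real symbolic regression explores such combinations with an evolutionary search:

```python
import itertools

import numpy as np

def fit(cols, y):
    """Least-squares fit of y against the given basis columns; returns (coef, rmse)."""
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, float(np.sqrt(np.mean((y - A @ coef) ** 2)))

# synthetic "runtime" samples over two parameters: problem size n, process count p
rng = np.random.default_rng(0)
n = rng.integers(1_000, 100_000, size=64).astype(float)
p = 2.0 ** rng.integers(1, 8, size=64)
y = 2.0 * n / p + 0.5 * np.log2(p)        # hidden model the search should recover

# candidate basis terms a symbolic-regression search would combine and mutate
library = {"1": np.ones_like(n), "n": n, "p": p, "n/p": n / p, "log2(p)": np.log2(p)}

# exhaustive search over all two-term models: a stand-in for the evolutionary search
coef, rmse, terms = min(
    (fit([library[a], library[b]], y) + ((a, b),)
     for a, b in itertools.combinations(library, 2)),
    key=lambda t: t[1],
)
```

Because the hidden model is exactly expressible in the library, the search recovers the `n/p` and `log2(p)` terms with near-zero residual, which a plain linear model in `n` and `p` cannot do.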
Feedback-Based Resource Allocation for Batch Scheduling of Scientific Workflows
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188055
Carl Witt, Dennis Wagner, U. Leser
Abstract: A scientific workflow is a set of interdependent compute tasks orchestrating large-scale data analyses or in-silico experiments. Workflows often comprise thousands of tasks with heterogeneous resource requirements that need to be executed on distributed resources. Many workflow engines handle parallelization by submitting tasks to a batch scheduling system, which requires resource usage estimates that have to be provided by users. We investigate the possibility of improving upon inaccurate user estimates by incorporating an online feedback loop between workflow scheduling, resource usage prediction, and measurement. Our approach can learn resource usage of arbitrary type; in this paper, we demonstrate its effectiveness by predicting the peak memory usage of tasks, an especially sensitive resource type: underestimation leads to task termination, while overestimation decreases throughput. We compare online versions of standard machine learning models for peak memory usage prediction and analyze their interactions with different workflow scheduling strategies. By means of extensive simulation experiments, we found that the proposed feedback mechanism improves resource utilization and execution times compared to typical user estimates.
Citations: 11
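The feedback loop described above can be sketched as follows. The moving-average predictor, the safety margin, and the retry-with-doubled-allocation policy are illustrative assumptions, not the machine learning models or scheduler policies evaluated in the paper:

```python
class PeakMemPredictor:
    """Online peak-memory predictor with a feedback loop: predict an
    allocation, observe the task's actual peak, update the estimate."""

    def __init__(self, initial_mb=1024.0, margin=1.2, alpha=0.3):
        self.estimate = initial_mb   # current belief about the peak (MB)
        self.margin = margin         # safety factor added on top
        self.alpha = alpha           # learning rate of the moving average

    def predict(self):
        return self.estimate * self.margin

    def observe(self, actual_mb):
        # feedback step: exponential moving average of observed peaks
        self.estimate += self.alpha * (actual_mb - self.estimate)

def run_task(pred, actual_peak_mb):
    """Simulate one batch submission: an underestimate terminates the task,
    which is retried with double the allocation; an overestimate wastes capacity."""
    alloc, retries = pred.predict(), 0
    while alloc < actual_peak_mb:        # task killed: out of memory
        alloc, retries = alloc * 2, retries + 1
    pred.observe(actual_peak_mb)         # close the feedback loop
    return alloc, retries
```

After a badly underestimated first task, the observed peak feeds back into the estimate, so later submissions of similar tasks need fewer retries.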
Performance Prediction for Power-Capped Applications based on Machine Learning Algorithms
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188144
Bo Wang, Jannis Klinkenberg, D. Ellsworth, C. Terboven, Matthias S. Müller
Abstract: Growing high performance computing (HPC) clusters are encountering a power wall due to limitations in the surrounding infrastructure. Maximizing a cluster's performance under a limited power budget is an open problem with high relevance and requires a deep understanding of application performance and power draw. Hardware components with the same technical specification have distinct power efficiencies, and applications running on those components have diverse power profiles. Enforcing a power limit on individual components changes their performance characteristics. In this work, we investigate and quantify the power and performance characteristics of various applications. Further, we present a systematic methodology for collecting the corresponding monitoring data and apply machine learning (ML) techniques to predict performance under particular power caps. The observed prediction error is under 3% in most cases, which is in the same range as the performance variation of application runs without a power cap.
Citations: 0
Energy performances of a routing protocol based on fuzzy logic approach in an underwater wireless sensor networks
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188061
Hajar Bennouri, A. Berqia
Abstract: Underwater Wireless Sensor Networks (UWSNs) are one of the most promising topics in wireless communications. High transmission power and lengthy data packet transmission consume a significant amount of energy due to the difficult nature of communication in this environment. Several methods are used to solve or reduce this problem. In this article, we study the impact of using a fuzzy logic approach in a routing protocol to evaluate the energy performance of an underwater wireless sensor network. We implement the FLOVP (Fuzzy Logic Optimized Vector Protocol) routing protocol, an improved version of the Vector-Based Forwarding (VBF) routing protocol, in the Aqua-Sim simulator for underwater wireless sensor networks (based on NS-2), and compare its performance in terms of energy consumed with the original VBF routing protocol.
Citations: 1
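The abstract does not reproduce FLOVP's rule base, but a Mamdani-style fuzzy forwarding decision generally looks like the sketch below. The membership functions, the rule weights, and the two inputs (normalized residual energy and distance to the routing vector) are hypothetical, chosen only to show the mechanism:

```python
def tri(x, a, b, c):
    """Triangular membership function with corners a < b < c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def forwarding_desirability(energy_frac, dist_frac):
    """Mamdani-style mini rule base, defuzzified by weighted average.
    Both inputs are normalized to [0, 1]; higher output = better forwarder."""
    e_low, e_high = tri(energy_frac, -1.0, 0.0, 1.0), tri(energy_frac, 0.0, 1.0, 2.0)
    d_near, d_far = tri(dist_frac, -1.0, 0.0, 1.0), tri(dist_frac, 0.0, 1.0, 2.0)
    # hypothetical rules: high energy and a node near the routing vector
    # make a good forwarder; low energy or a distant node should be avoided
    rules = [
        (min(e_high, d_near), 1.0),
        (min(e_high, d_far), 0.4),
        (min(e_low, d_near), 0.3),
        (min(e_low, d_far), 0.0),
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0
```

A node with high residual energy sitting near the vector then outranks a depleted node far from it, which is how a fuzzy score can replace VBF's crisp forwarding threshold.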
Cost Reduction Bounds of Proactive Management Based on Request Prediction
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188199
R. Milocco, P. Minet, É. Renault, S. Boumerdassi
Abstract: Data Centers (DCs) need to manage their servers periodically to meet user demand efficiently. Since the cost of the energy used to serve user demand is lower when DC settings (e.g., the number of active servers) are made a priori (proactively), there is great interest in studying proactive strategies based on predictions of requests. The amount of energy-cost savings that can be achieved depends not only on the selected proactive strategy but also on the statistics of the demand and the predictors used. Despite its importance, the complexity of the problem makes it difficult to find studies that quantify the savings that can be obtained. The main contribution of this paper is a generic methodology to quantify the possible cost reduction using proactive management based on predictions. Using this method together with past data, it is possible to quantify the efficiency of different predictors as well as to optimize proactive strategies. In this paper, the cost reduction is evaluated using both ARMA (Auto-Regressive Moving Average) and LV (Last Value) predictors. We then apply this methodology to the Google dataset collected over a period of 29 days to evaluate the benefit that can be obtained with those two predictors in the considered DC.
Citations: 0
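A minimal version of the two predictor families can be written as follows. The synthetic AR(1) request trace stands in for the Google dataset, and the single-coefficient AR(1) least-squares fit is a simplified stand-in for a full ARMA model:

```python
import numpy as np

def last_value_predict(series):
    """LV predictor: the next request volume equals the current one."""
    return series[:-1]

def ar1_predict(series):
    """AR(1) coefficient fitted by least squares; a stand-in for ARMA."""
    x, y = series[:-1], series[1:]
    phi = float(x @ y) / float(x @ x)
    return phi * x

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# synthetic request trace: an AR(1) process with coefficient 0.9, unit noise
rng = np.random.default_rng(1)
demand = np.empty(2000)
demand[0] = 0.0
for t in range(1, demand.size):
    demand[t] = 0.9 * demand[t - 1] + rng.normal()

truth = demand[1:]
err_lv = rmse(last_value_predict(demand), truth)
err_ar = rmse(ar1_predict(demand), truth)
```

On such a trace the fitted AR predictor beats LV, and the gap between the two prediction errors is exactly the kind of quantity the paper's methodology turns into a bound on proactive cost reduction.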
Approximating Memory-bound Applications on Mobile GPUs
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188051
Daniel Maier, Nadjib Mammeri, Biagio Cosenza, B. Juurlink
Abstract: Approximate computing techniques are often used to improve the performance of applications that can tolerate some amount of impurity in their calculations or data. In the context of embedded and mobile systems, a broad range of applications have exploited approximation techniques to improve performance and overcome the limited capabilities of the hardware. On such systems, even small performance improvements can be sufficient to meet scheduled requirements such as hard real-time deadlines. We study the approximation of memory-bound applications on mobile GPUs using kernel perforation, an approximation technique that exploits the availability of fast GPU local memory to provide high performance with more accurate results. Using this technique, we approximated six applications and evaluated them on two mobile GPU architectures with very different memory layouts: a Qualcomm Adreno 506 and an ARM Mali T860 MP2. The results show that, even when local memory is not mapped to dedicated fast memory in hardware, kernel perforation still achieves a 1.25x speedup because of improved memory layout and caching effects. Mobile GPUs with local memory show a speedup of up to 1.38x.
Citations: 1
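Kernel perforation can be illustrated with a toy stencil: compute the kernel only on a subset of the output and reconstruct the skipped elements from computed neighbours. The blur kernel and the skip-every-other-column scheme below are invented for the example; the sketch models the accuracy trade-off, not the GPU-side savings or the local-memory optimization the paper studies:

```python
import numpy as np

def blur_exact(img):
    """3-point horizontal box blur with edge padding (the 'kernel')."""
    padded = np.pad(img, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0

def blur_perforated(img):
    """Perforated kernel: evaluate only even output columns, then
    reconstruct each skipped column from its two computed neighbours."""
    h, w = img.shape
    exact = blur_exact(img)                  # in a real kernel, only half is computed
    out = np.empty((h, w), dtype=float)
    out[:, ::2] = exact[:, ::2]              # computed half of the output
    for j in range(1, w, 2):                 # interpolate the skipped half
        if j + 1 < w:
            out[:, j] = 0.5 * (out[:, j - 1] + out[:, j + 1])
        else:
            out[:, j] = out[:, j - 1]        # border: copy the neighbour
    return out
```

On smooth inputs the interpolated columns match the exact kernel in the interior, so roughly half the kernel evaluations buy only a small border error.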
Thin-Threads: An Approach for History-Based Monte Carlo on GPUs
2019 International Conference on High Performance Computing & Simulation (HPCS) Pub Date: 2019-07-01 DOI: 10.1109/HPCS48598.2019.9188080
R. Bleile, P. Brantley, D. Richards, S. Dawson, M. S. McKinley, M. O’Brien, H. Childs
Abstract: The graphics processing unit (GPU) has become a core technology for modern supercomputers. Applications that once ran on supercomputers are being forced to make significant changes to their designs to utilize these new machines. This paper introduces the concept of Thin-Threads as a method for history-based Monte Carlo transport applications on GPUs. The key principles behind Thin-Threads are light memory usage and communication, and the management of data-race issues via atomics. We show a 10x speedup when moving from the traditional method to Thin-Threads on GPUs. Additionally, we demonstrate the viability of the Thin-Threads model at scale on both GPU and CPU platforms.
Citations: 1
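A history-based transport driver can be sketched as below. The one-dimensional random-walk "physics" is invented for the example; the structural point is that each history runs to completion independently (one GPU thread per history in the Thin-Threads model) and the commented tally increment marks the shared-memory data race that a GPU implementation resolves with an atomic add:

```python
import random

def run_history(rng, n_cells=10, absorb_p=0.3):
    """One particle history: random walk over cells until absorbed or leaked.
    Returns the absorbing cell index, or None if the particle leaks out."""
    cell = 0
    while 0 <= cell < n_cells:
        if rng.random() < absorb_p:
            return cell
        cell += 1 if rng.random() < 0.5 else -1
    return None

def transport(n_histories, seed=42, n_cells=10):
    """History-based driver: one logical 'thread' per history, shared tally."""
    rng = random.Random(seed)
    tally = [0] * n_cells
    for _ in range(n_histories):
        cell = run_history(rng, n_cells)
        if cell is not None:
            tally[cell] += 1   # data race on a GPU: would be an atomicAdd
    return tally
```

Each history touches only its own local state plus the tally, which is the "light memory usage" property that makes the per-history thread mapping viable.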