2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD): Latest Papers, Page 2

Dynamic Reconfiguration of Data Parallel Programs

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.32

Vinícius Dias, Wagner Meira Jr, D. Guedes

Citations: 2

A Parallelization of a Simulated Annealing Approach for 0-1 Multidimensional Knapsack Problem Using GPGPU

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.25

Bianca de Almeida Dantas, E. Cáceres

Abstract: In recent decades, advances in multicore/manycore architectures have made it attractive to design algorithms that exploit such architectures to solve difficult problems more efficiently. A large number of real-world problems solved with the help of computer programs demand faster or better-quality solutions. Some of these problems can be modeled as classical theoretical problems, such as the 0-1 multidimensional knapsack problem (0-1 MKP), known to belong to the NP-hard class of problems, for which we cannot obtain an exact solution efficiently. This motivates the search for alternative strategies that can achieve good-quality approximate solutions, like metaheuristics, and also for ways to reduce their execution time, such as parallel algorithms that exploit multicore/manycore architectures. In this work we describe a parallelization of a simulated annealing approach using GPGPU to solve 0-1 MKP and compare our results to previous works in order to demonstrate the viability of its use.

Citations: 12
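To make the abstract above more concrete, here is a minimal sequential sketch of simulated annealing applied to 0-1 MKP. It is not the GPGPU parallelization the paper describes: the single-item flip neighborhood, the geometric cooling schedule, the parameter defaults, and the function name `anneal_mkp` are all assumptions chosen for illustration, and profits and weights are assumed to be non-negative.

```python
import math
import random


def anneal_mkp(profits, weights, capacities, steps=20000, t0=10.0, cooling=0.9995, seed=0):
    """Sequential simulated annealing for the 0-1 multidimensional knapsack problem.

    weights[j][i] is the amount of resource j consumed by item i.
    Returns (best_value, best_selection).
    """
    rng = random.Random(seed)
    n = len(profits)
    m = len(capacities)
    x = [0] * n                      # start from the empty (always feasible) knapsack
    load = [0] * m                   # resource usage of the current selection
    value = 0
    best_value, best_x = 0, x[:]
    t = t0
    for _ in range(steps):
        i = rng.randrange(n)         # neighbor: flip one item in or out
        sign = -1 if x[i] else 1
        new_load = [load[j] + sign * weights[j][i] for j in range(m)]
        # Infeasible neighbors are simply rejected (an assumption of this sketch).
        if all(new_load[j] <= capacities[j] for j in range(m)):
            delta = sign * profits[i]
            # Always accept improvements; accept losses with Boltzmann probability.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                x[i] ^= 1
                load = new_load
                value += delta
                if value > best_value:
                    best_value, best_x = value, x[:]
        t *= cooling                 # geometric cooling schedule
    return best_value, best_x
```

A parallel version would typically run many such chains or evaluate many neighbors at once; how the paper maps this onto the GPU is not stated in the abstract, so the sketch leaves it out.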

Planning Your SQL-on-Hadoop Deployment Using a Low-Cost Simulation-Based Approach

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.31

Jun Liu, Bianny Bian, Samantika Sury

Abstract: The term "SQL-on-Hadoop" has recently gained significant traction [19]. Impala represents a new, emerging class of SQL-on-Hadoop systems that exploit a shared-nothing parallel database architecture over Hadoop. Impala was designed to close the gap in near-real-time data analytics on the Hadoop stack, and it has shown itself to be significantly more efficient than other SQL-on-Hadoop solutions [13]. However, it is not a trivial task to leverage Impala for handling queries with different business demands [12]. An improperly deployed Impala cluster may not deliver the expected performance. In this paper, we propose a novel Impala simulation framework to help IT professionals understand its performance behavior. This would simplify the deployment planning work required to enable big data analytics on SQL-on-Hadoop systems. The Impala simulator models the behavior of a complete software stack and simulates the activities of cluster components such as storage, network, processors and memory. Moreover, the accuracy of the simulation remains high in response to both software configuration and hardware changes; it reflects the expected scaling trend with low cost overhead and fast simulation speed. The Impala simulator has been validated against various software and hardware configurations using the well-known TPC-DS benchmark [15], and the simulation results are consistent with expectations. A use case is provided to show how one would use the simulator to solve performance and deployment issues.

Citations: 5

A Study of Power-Performance Modeling Using a Domain-Specific Language

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.19

M. Umar, J. Meredith, J. Vetter, K. Cameron

Abstract: Energy use is now a first-class design constraint in high-performance systems and applications. Improving our understanding of application energy consumption in diverse, heterogeneous systems will be essential to efficient operation. For example, power limits in large-scale parallel and distributed systems will require optimizing performance under energy constraints. However, with increased levels of parallelism, complex memory hierarchies, hardware heterogeneity, and diverse programming models and interfaces, improving performance and energy efficiency simultaneously is exceedingly difficult. Our thesis is that estimating energy use, either a priori or as soon as possible at runtime, will be essential to future systems. Such estimates must adapt to changes in applications across hardware configurations. Existing approaches offer insight and detail, but typically are too cumbersome to enable adaptation at runtime or lack portability or accuracy. To overcome these limitations, we propose two energy estimation techniques that use the Aspen domain-specific language for performance modeling: ACEE (Algorithmic and Categorical Energy Estimation), a combination of analytical and empirical modeling techniques embedded in a runtime framework that leverages Aspen, and AEEM (Aspen's Embedded Energy Modeling), a system-level coarse-grained energy estimation technique that uses performance modeling from Aspen to generate energy estimations at runtime. This paper presents the methodology of the models and examines their accuracy as well as their advantages and challenges in several use cases.

Citations: 2

REPP-H: Runtime Estimation of Power and Performance on Heterogeneous Data Centers

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.27

Rajiv Nishtala, X. Martorell, V. Petrucci, D. Mossé

Abstract: One of the main challenges in data center systems is operating under a given Quality of Service (QoS) while minimizing power consumption. Increasingly, data centers are adopting heterogeneous server architectures with different power-performance trade-offs. This requires careful understanding of the application behavior across multiple architectures at runtime so as to meet specified power and performance requirements. In this work, we present and evaluate REPP-H (Runtime Estimation of Performance and Power on Heterogeneous data centers). REPP-H leverages hardware performance counters available on all major server architectures to provide a highly responsive power-capping mechanism and to deliver a minimum performance in a single step. We experimentally show that REPP-H can successfully estimate power and performance of several single-threaded and multiprogrammed workloads. The average errors on ARM, AMD and Intel architectures are, respectively, 7.1%, 9.0%, 7.1% when predicting performance, and 6.0%, 6.5%, 8.1% when predicting power on those heterogeneous servers.

Citations: 1

Speeding Up Stencil Computations with Kernel Convolution

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.18

G. C. Januario, Bryan S. Rosenburg, Yoonho Park, M. Perrone, J. Moreira, T. Carvalho

Citations: 2

Value Reuse Potential in ARM Architectures

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.30

R. C. D. Moura, Giovane O. Torres, M. Pilla, L. Pilla, Amarildo T. da Costa, F. França

Citations: 2

Scheduling Matters: Area-Oriented Heuristic for Resource Management

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.35

Jeremy Benson, Trilce Estrada, A. Rosenberg, M. Taufer

Citations: 1

HYPPO: A Hybrid, Piecewise Polynomial Modeling Technique for Non-Smooth Surfaces

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.12

Travis Johnston, Connor Zanin, M. Taufer

Abstract: The number and diversity of tunable parameters in applications make predicting settings that achieve optimal performance challenging. Complicating matters is the fact that resources are increasingly shared among computational tasks (for example, in cloud environments). Choosing any setting that yields near-optimal performance runs the risk of overusing shared resources. Building accurate models that capture the complicated interplay of parameters is crucial in order to maximize performance with minimal resource impact. Traditional techniques tend to fall short when modeling performance. One reason is that performance surfaces are often irregular, but most traditional techniques are designed to produce smooth models. In this paper we introduce a hybrid modeling technique that combines the strengths of surrogate-based modeling (SBM) and k nearest-neighbor regression (kNN) into a single method called HYPPO. The hybrid method is a piecewise polynomial model composed of many small, local models. We demonstrate that HYPPO significantly improves overall prediction accuracy compared with SBM and kNN.

Citations: 4
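The idea of a piecewise model built from many small, local fits can be illustrated with a short sketch. The code below is not HYPPO itself: it assumes a one-dimensional input, uses a degree-1 polynomial, and fits a fresh least-squares line to the k nearest samples for each query. The function name `knn_local_linear` and the default `k` are hypothetical.

```python
def knn_local_linear(xs, ys, query, k=4):
    """Predict y at `query` from a least-squares line fitted to the k nearest samples."""
    # Select the k samples closest to the query point (the kNN step).
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - query))[:k]
    n = len(nearest)
    mean_x = sum(x for x, _ in nearest) / n
    mean_y = sum(y for _, y in nearest) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in nearest)
    if sxx == 0.0:
        # All neighbors share one x value: fall back to plain kNN averaging.
        return mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in nearest)
    slope = sxy / sxx
    # Evaluate the local line (the small polynomial model) at the query.
    return mean_y + slope * (query - mean_x)
```

Because each prediction uses only nearby samples, the overall model can follow a kink such as the one in y = |x|, which a single smooth global fit would blur.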

Automatic Insertion of Copy Annotation in Data-Parallel Programs

2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PAD.2016.13

G. Mendonca, B. Guimarães, P. Alves, Fernando Magno Quintão Pereira, M. Pereira, G. Araújo

Abstract: Directive-based programming models, such as OpenACC and OpenMP, are emerging as promising techniques to support the development of parallel applications. These systems allow developers to convert a sequential program into a parallel one with minimum human intervention. However, inserting pragmas into production code is a difficult and error-prone task, often requiring familiarity with the target program. This difficulty restricts the ability of developers to annotate code that they have not written themselves. This paper provides one fundamental component in the solution of this problem. We introduce a static program analysis that infers the bounds of memory regions referenced in source code. Such bounds allow us to automatically insert data-transfer primitives, which are needed when the parallelized code is meant to be executed in an accelerator device, such as a GPU. To validate our ideas, we have applied them to Polybench, using two different architectures: Nvidia and Qualcomm-based. We have successfully analyzed 98% of all the memory accesses in Polybench. This result has enabled us to insert automatic annotations into those benchmarks, leading to speedups of over 100x.

Citations: 15
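As a rough illustration of how inferred memory bounds can drive a data-transfer annotation, the sketch below handles only a very narrow case: array accesses of the form `a[coeff * i + offset]` inside one loop `for i in range(lo, hi)` with known constant bounds. The paper's static analysis works on real source code and is not reproduced here; the function names `access_bounds` and `copy_clause` are hypothetical, and the emitted string simply follows the OpenACC `name[start:length]` array-section notation for C.

```python
def access_bounds(coeff, offset, lo, hi):
    """Inclusive index range touched by a[coeff * i + offset] for i in range(lo, hi)."""
    if hi <= lo:
        return None                  # the loop body never runs
    first = coeff * lo + offset
    last = coeff * (hi - 1) + offset
    # An affine index is monotonic in i, so the extremes occur at the loop ends.
    return min(first, last), max(first, last)


def copy_clause(array, accesses, lo, hi):
    """Merge the ranges of several affine accesses into one OpenACC-style data clause."""
    ranges = [b for b in (access_bounds(c, d, lo, hi) for c, d in accesses) if b is not None]
    if not ranges:
        return None
    start = min(r[0] for r in ranges)
    end = max(r[1] for r in ranges)
    # OpenACC array sections are written as name[start:length].
    return f"#pragma acc data copy({array}[{start}:{end - start + 1}])"
```

For example, a loop over `range(0, 100)` that reads `a[i]`, `a[i + 1]` and `a[2 * i]` would be summarized by `copy_clause("a", [(1, 0), (1, 1), (2, 0)], 0, 100)`, which covers indices 0 through 198.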