2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW): Latest Publications

An Efficient Channel Model for Evaluating Wireless NoC Architectures

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-26 DOI: 10.1109/SBAC-PADW.2016.23

Michael Opoku Agyeman, Quoc-Tuan Vien, G. Hill, S. J Turner, T. Mak

Citations: 3

PY-PITS: A Scalable Python Runtime System for the Computation of Partially Idempotent Tasks

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.10

E. Borin, C. Benedicto, I. Rodrigues, F. Pisani, M. Tygel, M. Breternitz

Citations: 6

A Comparative Study of SYCL, OpenCL, and OpenMP

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.19

H. C. D. Silva, F. Pisani, E. Borin

Abstract: Recent trends indicate that future computing systems will be composed by a group of heterogeneous computing devices, including CPUs, GPUs, and other hardware accelerators. These devices provide increased processing performance, however, creating efficient code for them may require that programmers manage memory assignments and use specialized APIs, compilers, or runtime systems, thus making their programs dependent on specific tools. In this scenario, SYCL is an emerging C++ programming model for OpenCL that allows developers to write code for heterogeneous computing devices that are compatible with standard C++ compilation frameworks. In this paper, we analyze the performance and programming characteristics of SYCL, OpenMP, and OpenCL using both a benchmark and a real-world application. Our performance results indicate that programs that rely on available SYCL runtimes are not on par with the ones based on OpenMP and OpenCL yet. Nonetheless, the gap is getting smaller if we consider the results reported by previous studies. In terms of programmability, SYCL presents itself as a competitive alternative to OpenCL, requiring fewer lines of code to implement kernels and also fewer calls to essential API functions and methods.

Citations: 28

Parallelism and Scalability: A Solution Focused on the Cloud Computing Processing Service Billing

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.14

Emmanoel M. De Sousa Junior, I. Sardiña, Frederico Lopes

Abstract: The application scheduling is an important requirement in the cloud computing context. It allows to define the required resources to execute applications tasks following predefined criteria, for instance, maximum execution time, number of virtual machines, volume of data, among others. Selecting process to choose the most appropriate execution structure is driven by scheduling algorithms. This paper proposes a scheduling mechanism for data processing in cloud computing environments. Such mechanism analyzes some specific variables in the business context of a software house specialized in software for lawyers and law offices. The main goal of this mechanism is to fulfill the seasonal company's demand using IaaS services and considering two policies: (i) the maximum execution time allowed by the application may not be exceeded and (ii) the data have to be processed considering the lowest possible monetary cost. The proposed solution generates strategies to select the best set of virtual machines to process the current bunch of data considering the amount of data, the estimated execution time for each specific strategy and the monetary cost of the virtual machines sets. In the context of this work, the strategy concept means the schedule of a set of virtual machines to process a specific amount of data, load balancing decisions and the parallelism of application's execution flow. The proposed solution has resulted in great impact for that company since it allowed the vertiginous increase of the amount of clients served.
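
The two policies named in the abstract above (a hard limit on execution time, then lowest monetary cost) can be illustrated with a small brute-force sketch. This is not the authors' mechanism: the `VMType` fields, the throughput and price figures, the perfectly balanced load, and the fractional-hour billing are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations_with_replacement


@dataclass(frozen=True)
class VMType:
    name: str              # hypothetical instance label
    gb_per_hour: float     # assumed processing throughput
    price_per_hour: float  # assumed price, billed per fractional hour


def cheapest_strategy(vm_types, data_gb, deadline_hours, max_vms=4):
    """Return (cost, hours, vm_names) for the cheapest VM set that meets
    the deadline, or None if no set of up to max_vms machines does."""
    best = None
    for n in range(1, max_vms + 1):
        for vms in combinations_with_replacement(vm_types, n):
            throughput = sum(vm.gb_per_hour for vm in vms)
            hours = data_gb / throughput  # assumes perfectly balanced load
            if hours > deadline_hours:
                continue  # policy (i): execution time limit must hold
            cost = hours * sum(vm.price_per_hour for vm in vms)
            candidate = (cost, hours, tuple(vm.name for vm in vms))
            if best is None or candidate < best:
                best = candidate  # policy (ii): keep the lowest cost
    return best


vm_types = [VMType("small", 10.0, 1.0), VMType("large", 40.0, 5.0)]
print(cheapest_strategy(vm_types, data_gb=100.0, deadline_hours=3.0))
```

With these made-up figures, a single small machine misses the 3-hour limit, so the search settles on four small machines (2.5 hours, cost 10.0) rather than one large machine (2.5 hours, cost 12.5). Exhaustive enumeration is only practical for small `max_vms`; a real scheduler would need a smarter search and measured execution-time estimates.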

Citations: 0

A Benchmark on Multi Improvement Neighborhood Search Strategies in CPU/GPU Systems

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.17

E. Rios, I. M. Coelho, L. Ochi, Cristina Boeres, R. Farias

Abstract: In combinatorial optimization problems, the neighborhood search (NS) is a fundamental component for local search based heuristics. It consists of selecting a solution from a high cardinality set of neighbor solutions, by means of operations called moves. To perform this search, NS algorithms usually adopt two main approaches: selecting the first or best improving move. The Multi Improvement (MI) strategy is a recently proposed method that consists in exploring simultaneously multiple move operations during the NS phase aiming to reach good quality solutions with shorter computational steps. This paper presents a benchmark for MI strategies in hybrid CPU/GPU systems. This technique efficiently explores the CPU processing power together with the massive parallelism achieved by modern GPUs, emerging as an efficient alternative for classic CPU neighborhood search strategies. The advantage of this approach depends heavily on finding the best tradeoff between CPU and GPU processing, as well as minimizing the memory transfers involved in the process. In the experiments, several MI configurations were tested in a hybrid CPU/GPU environment presenting better results than classical neighborhood search strategies for the Minimum Latency Problem, a hard combinatorial optimization problem.
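
The two classical approaches mentioned in the abstract above, first-improving and best-improving move selection, can be contrasted in a few lines. The sketch below is a plain sequential illustration on a toy Minimum Latency Problem instance. It is not the Multi Improvement strategy and not the authors' CPU/GPU implementation; the swap neighborhood, the fixed depot at index 0, and all function names are assumptions chosen for the example.

```python
from itertools import combinations


def latency(route, dist):
    """Minimum Latency Problem objective: sum of arrival times along the route."""
    total, elapsed = 0, 0
    for a, b in zip(route, route[1:]):
        elapsed += dist[a][b]
        total += elapsed
    return total


def swap_moves(route):
    # All pairs of positions to swap; index 0 is treated as a fixed depot.
    return combinations(range(1, len(route)), 2)


def apply_move(route, move):
    i, j = move
    new_route = list(route)
    new_route[i], new_route[j] = new_route[j], new_route[i]
    return new_route


def first_improvement(route, dist):
    """Return the first neighbor that improves on the current route."""
    base = latency(route, dist)
    for move in swap_moves(route):
        candidate = apply_move(route, move)
        if latency(candidate, dist) < base:
            return candidate
    return route  # local optimum for this neighborhood


def best_improvement(route, dist):
    """Scan the whole neighborhood and return the best improving neighbor."""
    best, best_cost = route, latency(route, dist)
    for move in swap_moves(route):
        candidate = apply_move(route, move)
        cost = latency(candidate, dist)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


# Toy instance: four points on a line at positions 0..3.
dist = [[abs(a - b) for b in range(4)] for a in range(4)]
start = [0, 3, 2, 1]
print(first_improvement(start, dist))  # [0, 2, 3, 1]
print(best_improvement(start, dist))   # [0, 1, 2, 3]
```

On this instance the starting route has latency 12. First improvement stops at the first better neighbor (latency 10), while best improvement evaluates all three swaps and finds the neighbor with latency 6. Evaluating every move is the part that lends itself to parallel execution.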

Citations: 7

A Hybrid Parallel Algorithm for the Auction Algorithm in Multicore Systems

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.21

A. P. Nascimento, C. Vasconcelos, F. S. Jamel, A. Sena

Citations: 5

A Processor Workload Distribution Algorithm for Massively Parallel Applications

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.13

Serge Midonnet, Achille Wattelar

Citations: 0

Outline of a Thick Control Flow Architecture

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.9

M. Forsell, J. Roivainen, V. Leppänen

Abstract: The recently invented thick control flow (TCF) model packs together an unbounded number of fibers, thread-like computational entities, flowing through the same control path. This promises to simplify parallel programming by partially eliminating looping and artificial thread arithmetics. In this paper we outline an architecture for efficiently executing programs written for the TCF model. It features scalable latency hiding via replication of instructions, radical synchronization cost reduction via a wave-based synchronization mechanism, and improved low-level parallelism exploitation via chaining of functional units. Replication of instructions is supported by a dynamic multithreading-like mechanism, which saves the fiber-wise data into special replicated register blocks. The architecture facilitates programmers with compact, unbounded notation of fibers and groups of them together with strong synchronous shared memory algorithmics. According to evaluations, the architecture is able to efficiently handle workloads featuring computational elements with the same control flow, independently of the number of elements. In its turn, this pays out as improved performance and lower power consumption due to elimination of redundant parts of computation and machinery.

Citations: 10

A Dynamic Load Balance Algorithm for the S4 Parallel Stream Processing Engine

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.12

V. Gil-Costa, Nicolás Hidalgo, Erika Rosas, Mauricio Marín

Citations: 1

Thread Footprint Analysis for the Design of Multithreaded Applications and Multicore Systems

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.18

R. Santos, Ricardo Aguiar, Paulo Soken, Samuel Ferraz, Liana Duenha

Citations: 0