2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW): Latest Publications (Page 2)

Task Scheduling in Sucuri Dataflow Library

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.15

Rafael J. N. Silva, Brunno F. Goldstein, Leandro Santiago, A. Sena, L. A. J. Marzulo, Tiago A. O. Alves, F. França

Citations: 11

Performance Optimization for SpMV on Multi-GPU Systems Using Threads and Multiple Streams

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.20

Ping Guo, Changjiang Zhang

Citations: 4

Towards a GPU Abstraction for Lua

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.11

Raphael Ribeiro, Paulo Motta

Citations: 2

Synchronization-Free Automatic Parallelization for Arbitrarily Nested Affine Loops

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.16

T. Klimek, M. Pałkowski, W. Bielecki

Citations: 0

An Efficient 2D Router Architecture for Extending the Performance of Inhomogeneous 3D NoC-Based Multi-Core Architectures

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.22

Michael Opoku Agyeman, W. Zong

Abstract: To meet the performance and scalability demands of the fast-paced technological growth towards exascale and Big-Data processing, and given the performance bottleneck of conventional metal-based interconnects, alternative interconnect fabrics such as inhomogeneous three-dimensional integrated Network-on-Chip (3D NoC) have emerged as a cost-effective solution for emerging multi-core designs. However, these interconnects trade off optimized performance for cost by restricting the number of area- and power-hungry 3D routers. Consequently, in this paper, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation due to the slow 2D routers in inhomogeneous 3D NoCs. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able to balance the traffic in the network to reduce the average packet latency under various traffic loads. Simulation shows that the proposed router can reduce the average packet delay by an average of 45% in 3D NoCs.

Citations: 13

Dataflow to Hardware Synthesis Framework on FPGAs

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) Pub Date : 2016-10-01 DOI: 10.1109/SBAC-PADW.2016.24

Youngsoo Kim, Shrikant S. Jadhav, C. Gloster

Abstract: We present a dataflow-based performance estimation and synthesis framework that helps hardware designers quantify algorithm performance and synthesize their HW designs onto Field Programmable Gate Arrays (FPGAs). Typically, Digital Signal Processing (DSP) systems are designed by making gradual architectural choices in HW refinement steps. These decisions are based on performance quantification by high-level DSP algorithm developers and HW implementation engineers. The main obstacle to this refinement is the provision of reasonably correct performance estimations to guide HW designers in Design Space Exploration (DSE) at an early stage. HW designers face challenges when they need to quantify the performance of their designs, especially when resources are limited. We use dataflow models, describing their hardware detail only as necessary. Dataflow-based performance estimation achieves the efficient generation of qualitative and quantitative parameters for the assessment of HW candidates. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude compared to kernel execution on a general-purpose processor. Specifically, FPGAs can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration from the dataflow to synthesized HDL design. Experimental results show a linear speedup by adding reasonably small processing elements in FPGA as opposed to using a software implementation running on a typical general-purpose processor.

Citations: 9