2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines最新文献_第5页

Acceleration of SQL Restrictions and Aggregations through FPGA-Based Dynamic Partial Reconfiguration 基于fpga的动态局部重构加速SQL约束和聚合

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.38

C. Dennl, Daniel Ziener, J. Teich

引用次数: 56

Memory Access Scheduling on the Convey HC-1 HC-1上的内存访问调度

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.55

Zheming Jin, J. Bakos

引用次数: 2

Hardware-Software Codesign for Embedded Numerical Acceleration 嵌入式数值加速的软硬件协同设计

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.27

Ranko Sredojevic, A. Wright, V. Stojanović

引用次数: 0

Global Control and Storage Synthesis for a System Level Synthesis Approach 系统级综合方法的全局控制与存储综合

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.61

Shuo Li, Nasim Farahini, A. Hemani

引用次数: 4

The Impact of Hardware Communication on a Heterogeneous Computing System 硬件通信对异构计算系统的影响

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.43

Shanyuan Gao, Bin Huang, R. Sass

引用次数: 3

An Evaluation of High-Performance Embedded Processing on MPPAs MPPAs的高性能嵌入式处理评价

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.44

Zain-ul-Abdin, B. Svensson

引用次数: 0

Exploiting Input Parameter Uncertainty for Reducing Datapath Precision of SPICE Device Models 利用输入参数不确定性降低SPICE器件模型数据路径精度

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.28

Nachiket Kapre

引用次数: 2

A Reconfigurable Architecture for 1-D and 2-D Discrete Wavelet Transform 一维和二维离散小波变换的可重构结构

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.23

Qing Sun, Jiang Jiang, Yongxin Zhu, Yuzhuo Fu

引用次数: 7

Latency-Optimized Networks for Clustering FPGAs 集群fpga的延迟优化网络

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.49

Trevor Bunker, S. Swanson

{"title":"Latency-Optimized Networks for Clustering FPGAs","authors":"Trevor Bunker, S. Swanson","doi":"10.1109/FCCM.2013.49","DOIUrl":"https://doi.org/10.1109/FCCM.2013.49","url":null,"abstract":"The data-intensive applications that will shape computing in the coming decades require scalable architectures that incorporate scalable data and compute resources and can support random requests to unstructured (e.g., logs) and semi-structured (e.g., large graph, XML) data sets. To explore the suitability of FPGAs for these computations, we are constructing an FPGAbased system with a memory capacity of 512 GB from a collection of 32 Virtex-5 FPGAs spread across 8 enclosures. This paper describes our work in exploring alternative interconnect technologies and network topologies for FPGA-based clusters. The diverse interconnects combine inter-enclosure high-speed serial links and wide, single-ended intra-enclosure on-board traces with network topologies that balance network diameter, network throughput, and FPGA resource usage. We discuss the architecture of high-radix routers in FPGAs that optimize for the asymmetry between the interand intra-enclosure links. We analyze the various interconnects that aim to efficiently utilize the prototype's total switching capacity of 2.43 Tb/s. The networks we present have aggregate throughputs up to 51.4 GB/s for random traffic, diameters as low as 845 nanoseconds, and consume less than 12% of the FPGAs' logic resources.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124484267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 17

Minerva: Accelerating Data Analysis in Next-Generation SSDs Minerva:加速下一代ssd的数据分析

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.46

Arup De, M. Gokhale, Rajesh K. Gupta, S. Swanson

{"title":"Minerva: Accelerating Data Analysis in Next-Generation SSDs","authors":"Arup De, M. Gokhale, Rajesh K. Gupta, S. Swanson","doi":"10.1109/FCCM.2013.46","DOIUrl":"https://doi.org/10.1109/FCCM.2013.46","url":null,"abstract":"Emerging non-volatile memory (NVM) technologies have DRAM-like latency with storage-like density, offering unique capability to analyze large data sets significantly faster than flash or disk storage. However, the hybrid nature of these NVM technologies such as phase-change memory (PCM) make it difficult to use them to best advantage in the memory-storage hierarchy. These NVMs lack the fast write latency required of DRAM and are thus not suitable as DRAM equivalent on the memory bus, yet their low latency even in random access patterns is not easily exploited over an I/O bus. In this work, we describe an FPGA-based system to execute application-specific operations in the NVM controller and evaluate its performance on two microbenchmarks and a keyvalue store. Our system Minerva1extends the conventional solidstate drive (SSD) architecture to offload data or I/O intensive application code to the SSD to exploit the low latency and high internal bandwidth of NVMs. Performing computation in the FPGA-based NVM storage controller significantly reduces data traffic between the host and storage and serves as an offload engine for data analysis workloads. A runtime library enables the programmer to offload computations to the SSD without dealing with the complications of the underlying architecture and inter-controller communication management. We have implemented a prototype of Minerva on the BEE3 FPGA system. We compare the performance of Minerva to a state of the art PCIe-attached PCM-based SSD. Minerva improves performance by an order of magnitude on two microbenchmarks. Minerva based key-value store performs up to 5.2 M get operations/s and 4.0 M set operations/s which is 7.45× and 9.85× higher than the PCM-based SSD that uses the conventional I/O architecture. This huge improvement comes from the reduction of data transfer between the storage to the host and the FPGA-based data processing in the SSD.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116611081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 68