2014 International Conference on Field-Programmable Technology (FPT): latest papers, page 4

A flexible interface architecture for reconfigurable coprocessors in embedded multicore systems using PCIe Single-root I/O virtualization

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082780

O. Sander, S. Bähr, Enno Lübbers, T. Sandmann, Viet Vu Duy, J. Becker

Citations: 9

HW acceleration of multiple applications on a single FPGA

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082797

Yidi Liu, Benjamin Carrión Schäfer

Citations: 0

Accelerating transfer entropy computation

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082754

Shengjia Shao, Ce Guo, W. Luk, Stephen Weston

Citations: 12

Logic emulation in the megaLUT era - Moore's Law beats Rent's Rule

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082742

M. Butts

Abstract: Throughout their twenty-five-year history, logic emulation architectures have been governed by Rent's Rule. This empirical observation, first used to build 1960s mainframes, predicts the average number of cut nets that result when a digital module is arbitrarily partitioned into multiple parts, such as the FPGAs of a logic emulator. A fundamental advantage of emulation is that, unlike most devices, FPGAs always grow in capacity according to Moore's Law, just as the designs to be emulated have grown. Unfortunately, packaging technology advances at a far slower pace, leaving emulators short on the pins demanded by Rent's Rule. Many cut nets are now sent through each package pin, which costs speed, power and area. At today's system-on-chip level of design, the number of system-level modules is growing, while their sizes are remaining constant. In the meantime, FPGAs have grown from a handful of logic lookup tables (LUTs) at the beginning to over a million LUTs today. At this scale, an entire system-level module such as an advanced 64-bit CPU can fit inside a single FPGA. Fewer module-internal nets need be cut, so Rent's Rule constraints are relaxing. Fewer and higher-level cut nets mean that logic emulation with megaLUT FPGAs is becoming faster, cooler, smaller, cheaper, and more reliable. The FPGA's Moore's Law scaling is escaping from Rent's Rule.

Pages: 1

Citations: 3
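The tension between capacity and pin count described in this abstract can be illustrated with the usual statement of Rent's Rule, T = t * g^p, where g is the number of logic blocks in a partition, t the average number of terminals per block, and p the Rent exponent. The following is a minimal sketch, not taken from the paper: the function name and the values t = 4.0 and p = 0.6 are illustrative assumptions only.

```python
def rent_terminals(num_blocks: float, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate external terminals (cut nets) via Rent's Rule: T = t * g**p."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return terminals_per_block * num_blocks ** rent_exponent


if __name__ == "__main__":
    # Hypothetical parameters chosen for illustration, not measured values.
    t, p = 4.0, 0.6
    for luts in (1_000, 1_000_000):
        estimate = rent_terminals(luts, t, p)
        print(f"{luts:>9} LUTs -> about {estimate:,.0f} terminals")
```

With an exponent below 1, a 1000-fold increase in block count raises the estimated terminal demand by a factor of 1000^0.6, roughly 63, which still outpaces package pin counts if those grow more slowly, as the abstract notes.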

A complementary architecture for high-speed true random number generator

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082786

Xian-wei Yang, R. Cheung

Citations: 8

Design re-use for compile time reduction in FPGA high-level synthesis flows

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082746

Marcel Gort, J. Anderson

Citations: 19

Evaluation of SNMP-like protocol to manage a NoC emulation platform

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082776

O. A. D. L. Junior, V. Fresse, F. Rousseau

Citations: 8

Analyzing the impact of heterogeneous blocks on FPGA placement quality

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082750

Chang Xu, Wentai Zhang, Guojie Luo

Citations: 2

Improve memory access for achieving both performance and energy efficiencies on heterogeneous systems

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082759

Hongyuan Ding, Miaoqing Huang

Abstract: Hardware accelerators are capable of achieving significant performance improvement for many applications. In this work we demonstrate that it is critical to provide sufficient memory access bandwidth for accelerators to improve performance and reduce energy consumption. We use the scale-invariant feature transform (SIFT) algorithm as a case study in which three bottleneck stages are accelerated on hardware logic. Based on the different memory access patterns of the SIFT algorithm, two approaches are designed to accelerate different functions in SIFT on the Xilinx Zynq-7045 device. In the first approach, convolution is accelerated by designing a fully customized hardware accelerator. On top of it, three interfacing methods are analyzed. In the second approach, a distributed multi-processor hardware system with its programming model is built to handle non-consecutive memory accesses. Furthermore, the last level cache (LLC) on the host processor is shared by all slaves to achieve better performance. Experimental results on the Zynq-7045 device show that the hybrid design, in which the two approaches are combined, can achieve ~10 times and better improvement in both performance and energy reduction compared with the pure software implementation for the convolution stage and the SIFT algorithm, respectively.

Pages: 91-98

Citations: 5

FPGA-based high throughput XTS-AES encryption/decryption for storage area network

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082791

Yi (Estelle) Wang, Akash Kumar, Yajun Ha

Abstract: The key to improving the performance of secure large-scale Storage Area Network (SAN) applications lies in the speed of the encryption/decryption module. Software-based encryption/decryption cannot meet throughput requirements. To solve this problem, we propose an FPGA-based XTS-AES encryption/decryption design to suit the needs of secure SAN applications with high throughput requirements. Besides throughput, area optimization is also considered in the proposed design. First, we reuse the same AES encryption to produce the tweak value and unify the operations of AES encryption/decryption in XTS-AES encryption/decryption. Second, we transfer the computations of AES encryption/decryption from GF(2^8) to GF((2^4)^2), which enables us to move the map and the inverse map functions outside the AES round. Third, we propose to support the SubBytes and the inverse SubBytes by the same hardware component. Finally, pipelined registers have been inserted into the proposed unrolled architecture for XTS-AES encryption/decryption. The experiments show that the proposed design achieves 36.2 Gbits/s throughput using 6784 slices on the XC6VLX240T FPGA.

Pages: 268-271

Citations: 7
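The tweak value mentioned in this abstract is, in the XTS mode of IEEE Std 1619, first obtained by AES-encrypting the sector number under the second key and then advanced from one 16-byte block to the next by multiplying by the primitive element alpha in GF(2^128). The sketch below shows only that per-block multiplication in software; it is not the authors' hardware design, and the function name is a hypothetical one chosen for illustration.

```python
def xts_mul_alpha(tweak: bytes) -> bytes:
    """Multiply a 16-byte XTS tweak by alpha (i.e. x) in GF(2^128).

    The tweak is read as a little-endian integer, as in IEEE Std 1619.
    The field is reduced by x^128 + x^7 + x^2 + x + 1, so a carry out of
    bit 127 is folded back in by XOR-ing 0x87 into the lowest byte.
    """
    if len(tweak) != 16:
        raise ValueError("XTS tweak must be exactly 16 bytes")
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:  # a bit was shifted out of the 128-bit field
        value = (value & ((1 << 128) - 1)) ^ 0x87
    return value.to_bytes(16, "little")


if __name__ == "__main__":
    # Arbitrary example tweak; a real one comes from AES(Key2, sector number).
    start = bytes(range(16))
    print(xts_mul_alpha(start).hex())
```

Because each block's tweak depends only on the previous tweak through this cheap shift-and-XOR, the expensive AES encryption of the sector number is needed once per data unit, which is consistent with the abstract's point about reusing a single AES encryption core.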