2018 International Conference on Field-Programmable Technology (FPT): Latest Papers, Page 2

Investigating How Hardware Architectures are Expressed in High-Level Languages for an SKA Algorithm

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00059

K. Sherwin, B. Stappers, P. Thiagaraj, K. Wang, O. Sinnen

Citations: 0

MultiMQC: A Multilevel Message Queuing Cache Combining In-NIC and In-Kernel Memories

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00029

Koya Mitsuzuka, Yuta Tokusashi, Hiroki Matsutani

Abstract: Message queuing systems that deliver messages from publishers to subscribers play an important role in collecting data from IoT devices. Traditional message queuing systems have improved their performance in the context of transferring log data from publishers such as Web servers to subscribers that analyze the log data. In this case, both publishers and subscribers have been assumed to have enough buffer capacity and can transfer data as jumbo frame packets for high efficiency. In recent IoT applications, however, publishers are small sensors or edge devices with low-power processors and limited memory capacity. Vast numbers of such publishers produce relatively small packets. Such large numbers of small messages significantly decrease the efficiency of conventional message queuing systems. To address this issue, a dedicated message queuing logic can be implemented on an FPGA-based network interface card (FPGA NIC). However, a serious issue with such an in-NIC approach is the limited memory capacity on the FPGA NIC. To handle message overflow of the in-NIC cache, this paper combines it with a large in-kernel software cache. More specifically, we propose a multilevel message queuing cache combining in-NIC and in-kernel memories, called MultiMQC. The multilevel cache improves the read performance. Regarding the write performance, MultiMQC introduces a batch transfer that packs small incoming messages into a single batch. We implemented MultiMQC using a NetFPGA-SUME board as the in-NIC cache and the Linux Netfilter framework as the in-kernel cache. The experimental results demonstrate that the write throughput is increased in proportion to the batch size. When pull requests hit in the in-NIC cache, the read throughput reaches 95.8% of the 10GbE line rate in four 10GbE interfaces.
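
The two ideas in the MultiMQC abstract, a small fast cache level that overflows into a large slower level and a write path that packs small messages into batches, can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `TwoLevelQueueCache`, the helper `pack_batches`, and the overflow policy (new messages go to the fast level only while the overflow level is empty, so FIFO order is kept) are hypothetical and are not taken from the paper.

```python
from collections import deque


def pack_batches(messages, batch_size):
    """Group small messages into batches of at most batch_size (hypothetical helper)."""
    return [messages[i:i + batch_size] for i in range(0, len(messages), batch_size)]


class TwoLevelQueueCache:
    """Toy FIFO queue with a small fast level and a large overflow level.

    The fast level stands in for in-NIC memory and the overflow level for
    the in-kernel cache. The placement policy is an assumption made for
    illustration only.
    """

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = deque()      # limited capacity
        self.overflow = deque()  # treated as unbounded in this sketch

    def push_batch(self, batch):
        """Store every message of one batch in a single call."""
        for msg in batch:
            # Use the fast level only while no newer data waits in overflow,
            # so everything in the fast level is older than everything in overflow.
            if not self.overflow and len(self.fast) < self.fast_capacity:
                self.fast.append(msg)
            else:
                self.overflow.append(msg)

    def pull(self):
        """Return (message, level) for the oldest message, or None if empty."""
        if self.fast:
            return self.fast.popleft(), "in-NIC"
        if self.overflow:
            return self.overflow.popleft(), "in-kernel"
        return None


cache = TwoLevelQueueCache(fast_capacity=2)
for batch in pack_batches(["m0", "m1", "m2", "m3", "m4"], batch_size=2):
    cache.push_batch(batch)

pulled = [cache.pull() for _ in range(5)]
```

In this model the first two pulls are served from the fast level and the remaining three from the overflow level, which mirrors the hit and miss cases the abstract distinguishes for read throughput.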

Citations: 1

Secure Hardware Kernels Execution in CPU+FPGA Heterogeneous Cloud

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00035

Festus Hategekimana, Joel Mandebi Mbongue, Md Jubaer Hossain Pantho, C. Bobda

Citations: 13

A Platform on All-Programmable SoC for Micro Autonomous Robots

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00085

Yuya Kudo, A. Takada, S. Tsuda, Takumi Sakai, T. Izumi

Citations: 2

AConFPGA: A Multiple-Output Boolean Function Approximation DSE Technique Targeting FPGAs

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00065

Jorge Echavarria, S. Wildermann, J. Teich

Citations: 1

An Accelerated OpenVX Overlay for Pure Software Programmers

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00056

Hossein Omidian, N. Ivanov, G. Lemieux

Abstract: This paper presents an FPGA-based overlay for accelerating computer vision applications written in OpenVX. A software programmer simply writes an application using the standard OpenVX API. The OpenVX overlay consists of an architecture and a runtime system that runs any OpenVX application, unmodified, in an accelerated manner on an FPGA. The architecture uses a Soft Vector Processor (SVP) for general acceleration, and a library of Vector Custom Instructions (VCIs) to further accelerate specific OpenVX kernels in the FPGA fabric. The VCIs are designed in advance by a skilled FPGA designer. The runtime system analyzes the OpenVX computational graph and selects some kernel nodes to be executed by VCIs, with the remaining kernel nodes to be executed by the SVP. In making the selection, the runtime system uses an optimization algorithm and relies upon bitstream relocation and bitstream merging to fit multiple VCIs into a single, fixed-size Partially Reconfigurable Region (PRR). The optimization algorithm must select the VCIs that satisfy the area constraint of the PRR and give the best overall application acceleration. For example, on a Canny-blur OpenVX application, an 8-lane SVP achieves a speedup of 5.3 over the hard ARM Cortex-A9. Selecting some nodes as VCIs provides another 3.5 times speedup, for an overall speedup of 18.5. The overlay enables OpenVX programmers with no FPGA design knowledge to accelerate their application.
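
The selection problem described in the abstract (choose VCIs that fit the PRR area and maximize acceleration) resembles a 0/1 knapsack problem. The sketch below is one hedged way to picture it, not the paper's algorithm: the exhaustive search, the function `select_vcis`, the candidate names, and every area and time-saved figure are hypothetical, and it assumes the time saved by each VCI simply adds up. The last line only checks the arithmetic of the reported speedups, since 5.3 times 3.5 is about 18.5.

```python
from itertools import combinations


def select_vcis(candidates, area_budget):
    """Pick the subset of candidates with the largest total time saved
    whose total area fits within area_budget.

    candidates: list of (name, area, time_saved) tuples (hypothetical units).
    Exhaustive search is used for clarity; it is only practical for small inputs.
    """
    best_names, best_saving = (), 0.0
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            area = sum(c[1] for c in subset)
            saving = sum(c[2] for c in subset)
            if area <= area_budget and saving > best_saving:
                best_names = tuple(c[0] for c in subset)
                best_saving = saving
    return best_names, best_saving


# Made-up candidates for a Canny-style pipeline: (name, area, time saved).
candidates = [
    ("gaussian_blur", 40, 5.0),
    ("sobel", 35, 4.0),
    ("threshold", 30, 3.5),
    ("magnitude", 25, 2.0),
]
chosen, saved = select_vcis(candidates, area_budget=100)

# Speedups compose multiplicatively: the SVP speedup times the additional VCI speedup.
overall_speedup = 5.3 * 3.5
```

With these made-up numbers the search keeps `gaussian_blur`, `sobel`, and `magnitude`, which exactly fill the area budget of 100; adding `threshold` instead of `magnitude` would exceed it.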

Citations: 5

Development of an FPGA Controlled "Mini-Car" Toward Autonomous Driving

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00084

Musashi Aoto, Y. Wada, Yousuke Numata

Citations: 3

A Study on Introducing FPGA to ROS Based Autonomous Driving System

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00090

Yasuhiro Nitta, Sou Tamura, Hideki Takase

Citations: 15

QoS-Aware Cross-Layer Reliability-Integrated FPGA-Based Dynamic Partially Reconfigurable System Partitioning

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00041

Siva Satyendra Sahoo, T. D. A. Nguyen, B. Veeravalli, Akash Kumar

Abstract: Dynamic Partial Reconfiguration (DPR) can be used for time-sharing of computing resources within Partially Reconfigurable Regions (PRRs) in FPGA-based systems. The heterogeneous partitioning in such systems allows the user to exploit the application-specific mapping of Partially Reconfigurable Modules (PRMs) to PRRs to implement more efficient designs. It offers increased opportunities in optimizing the reliability of the system across multiple layers - from the low-level physical one to the higher application layer. This method, called cross-layer reliability, can potentially exploit the application-specific tolerances to the quality of service (QoS) to tackle the increasing device fault-rates more cost-effectively by distributing the fault-mitigation to different layers. In this work, we propose a QoS-aware cross-layer reliability-integrated design methodology for FPGA-based DPR systems. Specifically, our methodology analyzes the requirements of the applications in terms of Functional Reliability, System Lifetime and Makespan to determine the best possible combinations of reliability-oriented design choices in different layers. We report average performance improvements of up to 24% and 30% for single- and multi-objective optimization-based system partitioning.

Citations: 3

Application Acceleration on FPGAs with OmpSs@FPGA

2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00021

Jaume Bosch, Xubin Tan, Antonio Filgueras, Miquel Vidal Piñol, Marc Mateu, Daniel Jiménez-González, C. Álvarez, X. Martorell, E. Ayguadé, Jesús Labarta

Abstract: OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous execution, we use OmpSs and OmpSs@FPGA as a prototype implementation to develop new ideas for OpenMP. OmpSs@FPGA implements the tasking model with runtime support to automatically exploit all SMP and FPGA resources available in the execution platform. In this paper, we present the OmpSs@FPGA ecosystem, based on the Mercurium compiler and the Nanos++ runtime system. We show how the applications are transformed to run on the SMP cores and the FPGA. The application kernels defined as tasks to be accelerated using the OmpSs directives are: 1) transformed by the compiler into kernels connected with the proper synchronization and communication ports, 2) extracted to intermediate files, 3) compiled through the FPGA vendor HLS tool, and 4) used to configure the FPGA. Our Nanos++ runtime system schedules the application tasks on the platform, being able to use the SMP cores and the FPGA accelerators at the same time. We present the evaluation of the OmpSs@FPGA environment with the Matrix Multiplication, Cholesky and N-Body benchmarks, showing the internal details of the execution, and the performance obtained on a Zynq Ultrascale+ MPSoC (up to 128x). The source code uses OmpSs@FPGA annotations, and different Vivado HLS optimization directives are applied for acceleration.

Citations: 21