2018 International Conference on Field-Programmable Technology (FPT): Latest Papers

Investigating How Hardware Architectures are Expressed in High-Level Languages for an SKA Algorithm
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00059
K. Sherwin, B. Stappers, P. Thiagaraj, K. Wang, O. Sinnen
Abstract: High-level approaches to hardware development can expedite the design process, allowing for rapid design space exploration. However, in order to generate optimised solutions expert intervention is often still required. This work seeks to explore the relationship between high-level descriptions and the resulting hardware architecture. This aims to reduce the barrier to entry for software developers (without hardware expertise) to produce optimised hardware designs through application of classical loop optimisation techniques. An algorithm from the Square Kilometre Array (SKA) is chosen to demonstrate the effects of such changes in a real world, real-time application requiring high throughput and low power consumption, taking a systematic approach in order to achieve an optimised result. A systolic array design is also discussed and compared with the software style changes. The Intel FPGA SDK for OpenCL (AOCL) Offline Compiler (AOC) is used here for verification and synthesis of the designs being examined, targeting an Arria-10 FPGA accelerator.
Citations: 0
MultiMQC: A Multilevel Message Queuing Cache Combining In-NIC and In-Kernel Memories
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00029
Koya Mitsuzuka, Yuta Tokusashi, Hiroki Matsutani
Abstract: Message queuing systems that deliver messages from publishers to subscribers play an important role to collect data from IoT devices. Traditional message queuing systems have improved their performance in the context of transferring log data from publishers such as Web servers to subscribers that analyze the log data. In this case, both publishers and subscribers have been assumed to have enough buffer capacity and can transfer data as jumbo frame packets for high efficiency. In recent IoT applications, however, publishers are small sensors or edge devices with low-power processors and limited memory capacity. Vast numbers of such publishers produce relatively small packets. Such a lot of small messages significantly decrease the efficiency of conventional message queuing systems. To address this issue, a dedicated message queuing logic can be implemented on FPGA-based network interface card (FPGA NIC). However, a serious issue of such in-NIC approach is a limited memory capacity on the FPGA NIC. To handle message overflow of the in-NIC cache, in this paper, it is combined with a large in-kernel software cache. More specifically, we propose a multilevel message queuing cache combining in-NIC and in-kernel memories, called MultiMQC. The multilevel cache improves the read performance. Regarding the write performance, MultiMQC introduces a batch transfer that packs small incoming messages into a single batch. We implemented MultiMQC using NetFPGA-SUME board as in-NIC cache and Linux Netfilter framework as in-kernel cache. The experimental results demonstrate that the write throughput is increased in proportion to the batch size. When pull requests hit in the in-NIC cache, the read throughput reaches 95.8% of 10GbE line rate in four 10GbE interfaces.
Citations: 1
Secure Hardware Kernels Execution in CPU+FPGA Heterogeneous Cloud
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00035
Festus Hategekimana, Joel Mandebi Mbongue, Md Jubaer Hossain Pantho, C. Bobda
Abstract: In this paper, we present a new security framework which allows controlled sharing and isolated execution of mutually distrusted FPGA-accelerators in heterogeneous cloud systems. The proposed framework enables the accelerators running in FPGAs in cloud computers to transparently inherit at run-time, software security policies of the virtual machines processes calling them. This capability allows system security policies enforcement mechanism to propagate access control privilege boundaries expressed at the hypervisor level, down to individual FPGA-accelerators. Furthermore, we present a software/hardware prototype implementation of the proposed security framework, showing that it can easily be transparently integrated within the virtual machine software stacks that run in today's cloud-based systems. Experimentation results show our proposed framework provides secure hardware execution with negligible execution overhead on guest VMs applications.
Citations: 13
A Platform on All-Programmable SoC for Micro Autonomous Robots
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00085
Yuya Kudo, A. Takada, S. Tsuda, Takumi Sakai, T. Izumi
Abstract: We present a platform on all-programmable SoC for micro autonomous robots for probing, exploring, rescuing, etc. Contrast to challenges for auto drive cars having relatively rich power supply and high-performance computing platform, our challenge is to develop technologies for autonomous robots with tight restrictions on size, weight, and energy consumption. We utilize all-programmable SoCs for the purpose and develop a system including camera interface, image processing, recognition, action planning, and motor control. The key techniques are to optimize the dataflow between software (and main memory) and hardware for efficiency and to adopt a standard stream interface in hardware modules for productivity. The system can be utilized as a common platform for micro autonomous robots. The system is implemented as a robot car named ZybotR2-Z2 and achieves 3 to 37 frame/sec image recognition and car control with single Zynq-7020 device.
Citations: 2
AConFPGA: A Multiple-Output Boolean Function Approximation DSE Technique Targeting FPGAs
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00065
Jorge Echavarria, S. Wildermann, J. Teich
Abstract: New relaxed quality standards laid down by approximate computing enrich the design pool with architectures dissipating less power, consuming fewer resources or with smaller latencies. In LUT-based FPGA logic approximation, the number of LUTs and latency associated to a design can be optimized by allowing the approximation of circuit results. In this paper, we present techniques for automatic design space exploration (DSE) of Boolean function falsifications and the ability and impact to reduce resources usage as well as the length of critical paths on LUT-based FPGAs. Our experiments give evidence that resource reductions of about 20% are easily achievable for error rates amounting to less than 0.05% w.r.t. accurate designs.
Citations: 1
An Accelerated OpenVX Overlay for Pure Software Programmers
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00056
Hossein Omidian, N. Ivanov, G. Lemieux
Abstract: This paper presents an FPGA-based overlay for accelerating computer vision applications written in OpenVX. A software programmer simply writes an application using the standard OpenVX API. The OpenVX overlay consists of an architecture and a runtime system that runs any OpenVX application, unmodified, in an accelerated manner on an FPGA. The architecture uses a Soft Vector Processor (SVP) for general acceleration, and a library of Vector Custom Instructions (VCIs) to further accelerate specific OpenVX kernels in the FPGA fabric. The VCIs are predesigned in advance by a skilled FPGA designer. The runtime system analyzes the OpenVX computational graph and selects some kernel nodes to be executed by VCIs, with the remaining kernel nodes to be executed by the SVP. In making the selection, the runtime system uses an optimization algorithm and relies upon bitstream relocation and bitstream merging to fit multiple VCIs into a single, fixed-size Partially Reconfigurable Region (PRR). The optimization algorithm must select the VCIs that satisfy the area constraint of the PRR and give the best overall application acceleration. For example, on a Canny-blur OpenVX application, an 8-lane SVP achieves speedup of 5.3 over the hard ARM Cortex-A9. Selecting some nodes as VCIs provides another 3.5 times speedup, for an overall speedup of 18.5. The overlay enables OpenVX programmers with no FPGA design knowledge to accelerate their application.
Citations: 5
Development of an FPGA Controlled "Mini-Car" Toward Autonomous Driving
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00084
Musashi Aoto, Y. Wada, Yousuke Numata
Abstract: We are developing an FPGA controlled "Mini-Car" for FPT'18 design competition toward realizing an autonomous driving car. In the competition, we need to realize fundamental techniques like localization and path planning, while employing road lane detection, traffic signals detection, and other objective detection methods. In this paper, we summarize our development plan of our Mini-Car to realize an autonomous driving techniques based on the regulations of the competition.
Citations: 3
A Study on Introducing FPGA to ROS Based Autonomous Driving System
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00090
Yasuhiro Nitta, Sou Tamura, Hideki Takase
Abstract: We are developing an autonomous driving robot using programmable SoC. The robot under development does not communicate with the external PC and performs all judgment and control on the board mounted on the robot. We aim to realize a built-in autonomous driving system with low power consumption and high performance by offloading high-load processing with the FPGA. At present, it is used only for acquiring camera images on the FPGA, but we are planning to do hardware implementation of the system constructed by software. In addition, we used ROS (Robot Operating System) to construct the robot's autonomous driving system, and the components to be developed are reusable. This document describes the detailed configuration and future prospect of the robot currently under development.
Citations: 15
QoS-Aware Cross-Layer Reliability-Integrated FPGA-Based Dynamic Partially Reconfigurable System Partitioning
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00041
Siva Satyendra Sahoo, T. D. A. Nguyen, B. Veeravalli, Akash Kumar
Abstract: Dynamic Partial Reconfiguration (DPR) can be used for time-sharing of computing resources within Partially Reconfigurable Regions (PRRs) in FPGA-based systems. The heterogeneous partitioning in such systems allows the user to exploit the application-specific mapping of Partially Reconfigurable Modules (PRMs) to PRRs to implement more efficient designs. It offers increased opportunities in optimizing the reliability of the system across multiple layers - from the low-level physical one to the higher application layer. This method, called cross-layer reliability, can potentially exploit the application-specific tolerances to the quality of service (QoS) to tackle the increasing device fault-rates more cost-effectively by distributing the fault-mitigation to different layers. In this work, we propose a QoS-aware cross-layer reliability-integrated design methodology for FPGA-based DPR systems. Specifically, our methodology analyzes the requirements of the applications in terms of Functional Reliability, System Lifetime and Makespan to determine the best possible combinations of reliability-oriented design choices in different layers. We report up to an average of 24% and 30% performance improvements for single and multi-objective optimization-based system partitioning.
Citations: 3
Application Acceleration on FPGAs with OmpSs@FPGA
2018 International Conference on Field-Programmable Technology (FPT) Pub Date : 2018-12-01 DOI: 10.1109/FPT.2018.00021
Jaume Bosch, Xubin Tan, Antonio Filgueras, Miquel Vidal Piñol, Marc Mateu, Daniel Jiménez-González, C. Álvarez, X. Martorell, E. Ayguadé, Jesús Labarta
Abstract: OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous execution, we use OmpSs and OmpSs@FPGA as prototype implementation to develop new ideas for OpenMP. OmpSs@FPGA implements the tasking model with runtime support to automatically exploit all SMP and FPGA resources available in the execution platform. In this paper, we present the OmpSs@FPGA ecosystem, based on the Mercurium compiler and the Nanos++ runtime system. We show how the applications are transformed to run on the SMP cores and the FPGA. The application kernels defined as tasks to be accelerated, using the OmpSs directives are: 1) transformed by the compiler into kernels connected with the proper synchronization and communication ports, 2) extracted to intermediate files, 3) compiled through the FPGA vendor HLS tool, and 4) used to configure the FPGA. Our Nanos++ runtime system schedules the application tasks on the platform, being able to use the SMP cores and the FPGA accelerators at the same time. We present the evaluation of the OmpSs@FPGA environment with the Matrix Multiplication, Cholesky and N-Body benchmarks, showing the internal details of the execution, and the performance obtained on a Zynq Ultrascale+ MPSoC (up to 128x). The source code uses OmpSs@FPGA annotations and different Vivado HLS optimization directives are applied for acceleration.
Citations: 21