2019 International Conference on Field-Programmable Technology (ICFPT): Latest Papers, Page 6

Real-Time Automatic Modulation Classification

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00052

Stephen Tridgell, D. Boland, P. Leong, Siddhartha

Citations: 6

FPGA-Based Object Detection for Autonomous Driving System

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00094

K. Harada, K. Kanazawa, M. Yasunaga

Citations: 4

Dependency-Aware Clustering for Variable-Grained Hardware-Software Partitioning

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00080

Deshya Wijesundera, Nadeeshan D. K. Dissanayake, Alok Prakash, T. Srikanthan, Damith Anhettigama

Citations: 0

An Open-Source Lightweight Timing Model for RapidWright

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00028

P. Maidee, Christopher E. Neely, A. Kaviani, C. Lavin

Citations: 4

SkyCastle: A Resource-Aware Multi-Loop Scheduler for High-Level Synthesis

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00013

J. Oppermann, Lukas Sommer, Lukas Weber, Melanie Reuter-Oppermann, A. Koch, O. Sinnen

Abstract: A common optimisation problem in the high-level synthesis (HLS) of FPGA-based accelerators is to find a microarchitecture that maximises the performance while keeping the utilisation of the device's low-level resources below certain limits. We propose to tackle it directly as part of the HLS scheduler. To that end, we formalise a general, integrated scheduling and allocation problem for HLS kernels, and present SkyCastle, a novel resource-aware multi-loop scheduler using integer linear programming to solve it for a subclass of kernels composed of multiple, nested loops. In order to demonstrate the practical applicability of the approach, we model the scheduler in such a way as to be plug-in compatible with the Xilinx Vivado HLS engine, allowing the computed solutions to be fed back into its synthesis flow. We evaluate SkyCastle for three non-trivial kernels from the machine learning, signal processing, and physical simulation domains, on two FPGA devices. Additionally, we investigate the replication of slightly slower, but smaller accelerators as a means to further boost the overall performance. In contrast to Vivado HLS' default settings, which aim at maximum performance but may fail in later synthesis steps, the solutions computed by our scheduler always result in synthesisable designs.

Citations: 3

A Study on Switch Block Patterns for Tileable FPGA Routing Architectures

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00039

Xifan Tang, Edouard Giacomin, Aurélien Alacchi, P. Gaillardon

Abstract: Following the rapid growth of Field Programmable Gate Array (FPGA) sizes, the regularity of architectures has become a critical feature, leading to the development of million-of-LUT devices. While the routing architecture plays a dominant role in the area, delay and power of modern FPGAs, most previously published works focus on improving the routability and performance of FPGAs, while very few have studied tileable (highly regular) routing architectures. In this paper, we provide a detailed comparison between tileable and popular non-tileable FPGAs considering modern routing architectures. First, we upgrade VPR to generate tileable routing architectures, which can support different switch block patterns for (1) the routing tracks that start/end in a tile and (2) the routing tracks that pass through a tile. Then, we evaluate the performance of mixed switch block patterns in the context of a Stratix IV-like FPGA architecture, by considering the most representative patterns, i.e., Subset, Universal and Wilton. Experimental results show that, averaged over the MCNC and VTR benchmarks, when compared to the well-optimized non-tileable architectures, the tileable architectures can improve the minimum routable channel width by 13% and area-delay product by 2%. In particular, our results showed that in the context of tileable FPGAs, a mix of Universal and Wilton switch block patterns leads to the best tradeoff in area, delay and routability, while the Wilton switch block was the best choice in non-tileable FPGAs.

Citations: 14

AutoBoxing: Improving GCC Passes to Optimize HW/SW Multi-Versioning of Kernels for HLS

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00057

Johanna Rohde, C. Hochberger

Citations: 0

MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00062

Seyedramin Rasoulinezhad, Sean Fox, Hao Zhou, Lingli Wang, D. Boland, P. Leong

Citations: 2

Power-Aware FPGA Mapping of Convolutional Neural Networks

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00059

Alexander Montgomerie-Corcoran, Stylianos I. Venieris, C. Bouganis

Abstract: With unprecedented accuracy in numerous AI tasks, convolutional neural networks (CNNs) are rapidly being deployed in power-limited mobile and embedded applications. Existing mapping approaches focus on achieving high performance without explicit consideration of power consumption, leading to suboptimal solutions when power is considered in a subsequent stage. In this context, there is an emerging need for power-aware methodologies for the design of custom CNN engines. In this work, a methodology is presented for modelling the power consumption of FPGA-based CNN accelerators using a high-level description of modules, together with a power-centric search strategy for exploring power-performance trade-offs within the CNN-to-FPGA design space. By integrating into an existing CNN-to-FPGA toolflow, the proposed power estimation method can yield a prediction accuracy of 93.4% for total system power consumption. Furthermore, it is demonstrated that the associated power-oriented exploration approach can generate CNN accelerators with a 20.1% power reduction over a purely throughput-driven design for AlexNet, maintaining the design's throughput.

Citations: 5

Enhanced Heterogeneous Cloud: Transparent Acceleration and Elasticity

2019 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2019-12-01 DOI: 10.1109/ICFPT47387.2019.00027

Jessica Vandebon, J. Coutinho, W. Luk, E. Nurvitadhi, Mishali Naik

Abstract: This paper presents ORIAN, a fully-managed Platform-as-a-Service (PaaS) for deploying high-level applications onto large-scale heterogeneous cloud infrastructures. We aim to make specialised, accelerator resources in the cloud accessible to software developers by extending the traditional homogeneous PaaS execution model to support automatic runtime management of heterogeneous compute resources such as CPUs and FPGAs. In particular, we focus on two mechanisms: transparent acceleration, which automatically maps jobs to the most suitable resource configuration, and heterogeneous elasticity, which performs automatic vertical (type) and horizontal (quantity) scaling of provisioned resources to guarantee QoS (Quality of Service) objectives while minimising cost. We develop a prototype to validate our approach, targeting a hardware platform with combined computational capacity of 28 FPGAs and 36 CPU cores, and evaluate it using case studies in three application domains: machine learning, bioinformatics, and physics. Our transparent acceleration decisions achieve on average 96% of the maximum manually identified static configuration throughput for large workloads, while removing the burden of determining configuration from the user; an elastic ORIAN resource group provides a 2.3 times cost reduction compared to an over-provisioned group for non-uniform, peaked job sequences while guaranteeing QoS objectives; and our malleable architecture extends to support a new, more suitable resource type, automatically reducing the cost by half while maintaining throughput, and achieving a 23% throughput increase while fulfilling resource constraints.

Citations: 4