2022 International Conference on Field-Programmable Technology (ICFPT): Latest Papers

Area-Efficient Memory Scheduling for Dynamically Scheduled High-Level Synthesis
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974262
Xue-Xin He, Jianyi Cheng, G. Constantinides
Abstract: In high-level synthesis, scheduling maps operations into clock cycles. It can be done either at compile time (statically) or at run time (dynamically). There has been recent interest in dynamic scheduling, as it can potentially achieve better performance. The state-of-the-art dynamically scheduled HLS tool Dynamatic generates dataflow-style hardware as a netlist of pre-defined components connected using handshake signals. The memory operations are executed by a component named the load-store queue (LSQ), which can achieve run-time out-of-order memory accesses for high performance. However, the additional logic for the LSQ leads to significant area overhead compared to static scheduling. In this paper, we propose an area-efficient approach for scheduling memory operations at run time. We approximate the memory dependence distance to its minimal value and efficiently parallelise memory accesses in dynamically scheduled hardware. Over several benchmarks from related works, our results show that our approach achieves on average $0.2\times$ the area-delay product of the original designs using LSQs.
Citations: 0
Efficient Reinforcement Learning Framework for Automated Logic Synthesis Exploration
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974330
Yu Qian, Xuegong Zhou, Hao Zhou, Lingli Wang
Abstract: Logic synthesis is a crucial step in electronic design automation tools for integrated circuit design. In recent years, the development of reinforcement learning (RL) has enabled designers to automatically explore the logic synthesis process. Existing RL-based methods typically use conventional on-policy models, which leads to data inefficiency. Moreover, the exploration approach for FPGA technology mapping in recent works lacks flexibility in the learning process. In this work, we propose ESE, a reinforcement learning based framework to efficiently learn the logic synthesis process. The framework supports modeling of both logic optimization and FPGA technology mapping. The reward functions and terminal conditions in the RL environment are designed to efficiently guide the optimization of the metrics and execution time. For the modeling of FPGA mapping, logic optimization and technology mapping are combined so that they can be learned in a flexible way. Moreover, the Proximal Policy Optimization model is adopted to improve the utilization of samples. The proposed framework is evaluated on several common benchmarks. For logic optimization on the EPFL benchmark, compared with previous works, the proposed method obtains an 11.3% improvement in the average quality (node-level product) and reduces the execution time by 13.7%. For FPGA technology mapping on the VTR benchmark, our method improves the average quality (LUT-level product) by 14.8% and reduces the execution time by 14.4% compared with recent work.
Citations: 0
Accelerating Transformer Neural Networks on FPGAs for High Energy Physics Experiments
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974463
Filip Wojcicki, Zhiqiang Que, A. Tapper, W. Luk
Abstract: High Energy Physics studies the fundamental forces and elementary particles of the Universe. With the unprecedented scale of experiments comes the challenge of accurate, ultra-low latency decision-making. Transformer Neural Networks (TNNs) have been proven to accomplish cutting-edge accuracy in classification for hadronic jet tagging. Nevertheless, software-centered solutions targeting CPUs and GPUs lack the inference speed required for real-time particle triggers, most notably those at the CERN Large Hadron Collider. This paper proposes a novel TNN-based architecture, efficiently mapped to Field-Programmable Gate Arrays, that outperforms GPU inference capabilities involving state-of-the-art neural network models by approximately 1000 times while preserving comparable classification accuracy. The design offers high customizability and aims to bridge the gap between hardware and software development by using High-Level Synthesis. Moreover, we propose a novel model-independent post-training quantization search algorithm that works in general hardware environments according to user-defined constraints. Experimental evaluation yields a 64% reduction in overall bit-widths with a 2% accuracy loss.
Citations: 3
Cloning the Unclonable: Physically Cloning an FPGA Ring-Oscillator PUF
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974597
Hayden Cook, Jonathan Thompson, Zephram Tripp, B. Hutchings, Jeffrey B. Goeders
Abstract: This work presents a novel technique to physically clone a ring oscillator physically unclonable function (RO PUF) onto another distinct FPGA die, using precise, targeted aging. The resulting cloned RO PUF provides a response that is identical to its copied FPGA counterpart, i.e., the FPGA and its clone are indistinguishable from each other. Targeted aging is achieved by: 1) heating the FPGA using bitstream-located short circuits, and 2) enabling/disabling ROs in the same FPGA bitstream. During self-heating caused by short circuits contained in the FPGA bitstream, circuit areas containing oscillating ROs (enabled) degrade more slowly than circuit areas containing non-oscillating ROs (disabled), due to bias temperature instability effects. This targeted aging technique is used to swap the relative frequencies of two ROs, which in turn flips the corresponding bit in the PUF response. Two experiments are described. The first experiment uses targeted aging to create an FPGA that exhibits the same PUF response as another FPGA, i.e., a clone of an FPGA PUF onto another FPGA device. The second experiment demonstrates that this aging technique can create an RO PUF with any desired response.
Citations: 2
Modeling FPGA-based Architectures for Robotics
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974412
Ariel Podlubne, D. Göhringer
Abstract: There have been partial contributions in the state of the art about FPGAs being part of robotics systems. However, a study of FPGAs as a whole for robotics systems is missing in the literature. This means that defining all the components required for an FPGA-based system for robotics applications as a whole, their integration into existing solutions, and the generation of said components has not been done. The traditional robotics workflow involves many disciplines (e.g., mechatronics, control, software) where experts deal with the integration of all individual parts. We propose a model-based component-oriented workflow, focusing on easing the integration of all the parts to deploy FPGA-based robotics systems automatically. Our systematic approach reduces the effort needed to deploy a system tenfold compared with doing it manually. Furthermore, it converts the arduous and error-prone manual process into a simple system description.
Citations: 0
GraFF: A Multi-FPGA System with Memory Semantic Fabric for Scalable Graph Processing
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974189
Xu Zhang, Yisong Chang, Tianyue Lu, Ke Liu, Ke Zhang, Mingyu Chen
Abstract: FPGAs have been a promising solution for graph processing in many scenarios. With the rapid growth in graph size, the on/off-chip memory capacity of a single FPGA is insufficient to hold large-scale graphs. To tackle this problem, in this position paper, we introduce GraFF, a Graph processing system with multiple FPGAs interconnected via a custom memory semantic Fabric. In order to efficiently exploit system parallelism, we first split the traversal of graph data into a series of independent fine-grained flits that are concurrently delivered among FPGAs as sheer memory semantic transactions. Then we relax the FPGAs' synchronization from strict barrier boundaries between adjacent supersteps to fully parallelize graph traversing and computing. We build a prototype of GraFF with four custom FPGA nodes. Preliminary evaluation results based on the Breadth First Search (BFS) algorithm show that the peak performance of GraFF reaches up to 6.23 GTEPS. Moreover, GraFF exhibits linear scalability when the number of FPGAs rises from one to four.
Citations: 0
FPT 22 on Site Proceedings
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974397
Citations: 0
Leveraging FPGA Primitives to Improve Word Reconstruction during Netlist Reverse Engineering
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974401
Reilly McKendrick, Corey Simpson, B. Nelson, Jeffrey B. Goeders
Abstract: While attempting to perform hardware trojan detection, or other low-level design analyses, it is often necessary to inspect and understand the gate-level netlist of an implemented hardware design. Unfortunately, this process is challenging, as at the physical level the design does not contain any hierarchy, net names, or word groupings. Previous work has shown how gate-level netlists can be analyzed to restore high-level circuit structures, including reconstructing multi-bit signals, which aids a user in understanding the behavior of the design. In this work we explore improvements to the word reconstruction process, specific to FPGA platforms. We demonstrate how hard-block primitives in a design (carry chains, block memories, multipliers) can be leveraged to better predict which signals belong to the same words in the original design. Our technique is evaluated using the VTR benchmarks, synthesized for a 7-series Xilinx FPGA, and the results are compared to DANA, a known word reconstruction tool.
Citations: 1
Load-Store Queue Sizing for Efficient Dataflow Circuits
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974425
Jiantao Liu, Carmine Rizzi, Lana Josipović
Abstract: Dataflow circuits implement dynamic scheduling and have recently been explored as an alternative to standard, statically scheduled high-level synthesis (HLS) solutions. In contrast to static HLS, dataflow circuits resolve memory dependencies during runtime by employing load-store queues (LSQs) at the memory interface. However, LSQs are extremely resource-expensive to implement in a spatial system and may cause notable frequency degradation. Therefore, there is a clear need to minimize their size and complexity, while still allowing the circuit to achieve a high computational rate. So far, designers have resorted to manually tuning the LSQ depth (i.e., number of queue entries) to trade off area and performance; yet, this approach is evidently time-consuming and infeasible for complex designs. In this work, we develop a strategy to automatically determine the most affordable LSQ depths in dataflow circuits while maintaining the best possible circuit throughput. We demonstrate our technique on benchmarks obtained from C code with different memory access patterns and show that it can effectively produce the desired Pareto-optimal design points.
Citations: 2
FPGA Implementation of Low-Latency Recursive Median Filter
2022 International Conference on Field-Programmable Technology (ICFPT) Pub Date : 2022-12-05 DOI: 10.1109/ICFPT56656.2022.9974273
Bo Peng, Yuzhu Zhou, Qiang Li, Maosong Lin, Jiankui Weng, Qiang Zeng
Abstract: The recursive median filter has stronger noise attenuation capability than the median filter, especially for high-intensity and irregularly distributed noise. However, the recursive operation prevents the recursive median filter from being pipelined, which leaves it not real-time enough to be widely applied. This paper presents an FPGA implementation of a low-latency recursive median filter. The proposed architecture completes the median calculation of the current window and the data pre-processing of the next window in one clock cycle, thereby reducing the calculation complexity of each median. The results show that for a 5x5 window, the proposed recursive median filter core operates at a maximum frequency of 334 MHz on a Zynq UltraScale+ FPGA device, which meets the real-time processing requirements for Full High Definition (FHD) images.
Citations: 1