Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays: Latest Articles

Enhancements in UltraScale CLB Architecture
Shant Chandrakar, D. Gaitonde, T. Bauer
{"title":"Enhancements in UltraScale CLB Architecture","authors":"Shant Chandrakar, D. Gaitonde, T. Bauer","doi":"10.1145/2684746.2689077","DOIUrl":"https://doi.org/10.1145/2684746.2689077","url":null,"abstract":"Each generation of FPGA architecture benefits from optimizations around its technology node and target usage. In this paper, we discuss some of the changes made to the CLB for Xilinx's 20nm UltraScale product family. We motivate those changes and demonstrate better results than previous CLB architectures on a variety of metrics. We show that, in demanding scenarios, logic placed in an UltraScale device requires 16% less wirelength than 7-series. Designs mapped to UltraScale devices also require fewer logic tiles. In this paper, we demonstrate the utilization benefits of the UltraScale CLB attributed to certain CLB enhancements. The enhancements described herein result in an average packing improvement of 3% for the example design suite. We also show that the UltraScale architecture handles aggressive, tighter packing more gracefully than previous generations of FPGA. These significant reductions in wirelength and CLB counts translate directly into power, performance and ease-of-use benefits.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129092771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
A Novel Method for FPGA Test Based on Partial Reconfiguration and Sorting Algorithm (Abstract Only)
Xianjian Zheng, Fan Zhang, Lei Chen, Zhiping Wen, Yuanfu Zhao, Xuewu Li
{"title":"A Novel Method for FPGA Test Based on Partial Reconfiguration and Sorting Algorithm (Abstract Only)","authors":"Xianjian Zheng, Fan Zhang, Lei Chen, Zhiping Wen, Yuanfu Zhao, Xuewu Li","doi":"10.1145/2684746.2689123","DOIUrl":"https://doi.org/10.1145/2684746.2689123","url":null,"abstract":"The programmability of an FPGA poses a number of challenges when it comes to complete and comprehensive testing of the FPGA itself. A large number of configurations must be downloaded into the FPGA to test the programmable sources. A great many methods were proposed to reduce the number of configurations to minimize the test time, but few of papers were focus on reducing single configuration time. This paper proposes a novel method to reduce more than 30% of the total configuration time based on partial reconfiguration technology and sorting algorithm. This method is implemented on a series of SRAM-based FPGAs. The experimental result shows that this method reduces 30%-45% of the total configuration time and can be generally applied to all SRAM-based FPGAs currently.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126698553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Method for Enabling FPGA Context-Switch (Abstract Only)
A. Bourge, O. Muller, F. Rousseau
{"title":"A Novel Method for Enabling FPGA Context-Switch (Abstract Only)","authors":"A. Bourge, O. Muller, F. Rousseau","doi":"10.1145/2684746.2689096","DOIUrl":"https://doi.org/10.1145/2684746.2689096","url":null,"abstract":"Modern FPGAs provide great computational power, flexible resources and a versatile environment. Managing to obtain the best of these three worlds is rather complicated given the actual design flows. Our work focus on enabling task multiplexing, as part of a more flexible FPGA usage. Task multiplexing in FPGAs raises indeed a lot of questions. Multiplexing the usage of a reconfigurable fabric is leading to a better utilization of its surface because it offers to share its resources not only in space (number of slices allocated to a task) but also in time (tasks are allowed in time slots). The base mechanism known as context-switch consists in removing a task after its allowed time slot has passed. The first step toward efficiently multiplex tasks in a reconfigurable fabric is to decide when this removal will have the least possible impact on the system. This poster presents our preliminary results concerning what we consider as necessary in order to enable such a feature. Our work focus on finding automatically the best instants of the task execution in order to effectively remove a running task from the FPGA, taking into account the time needed to extract a relevant context necessary to restart it later. This instant selection is performed at a high level of abstraction, enabling us to make choices with an accurate knowledge of the task nature and specificities. The second part of this poster presents the entire mechanism which makes use of the previously selected slots in order to switch between tasks.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121552146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Numerical Program Optimization for High-Level Synthesis
Xitong Gao, G. Constantinides
{"title":"Numerical Program Optimization for High-Level Synthesis","authors":"Xitong Gao, G. Constantinides","doi":"10.1145/2684746.2689090","DOIUrl":"https://doi.org/10.1145/2684746.2689090","url":null,"abstract":"This paper introduces a new technique, and its associated open source tool, SOAP2, to automatically perform source-to-source optimization of numerical programs, specifically targeting the trade-off between numerical accuracy and resource usage as a high-level synthesis flow for FPGA implementations. We introduce a new intermediate representation, which we call metasemantic intermediate representation (MIR), to enable the abstraction and optimization of numerical programs. We efficiently discover equivalent structures in MIRs by exploiting the rules of real arithmetic, such as associativity and distributivity, and rules that allow control flow restructuring, and produce Pareto frontiers of equivalent programs that trades off LUTs, DSPs and accuracy. Additionally, we further broaden the Pareto frontier in our optimization flow to automatically explore the numerical implications of partial loop unrolling and loop splitting. In real applications, our tool discovers a wide range of Pareto optimal options, and the most accurate one improves the accuracy of numerical programs by up to 65%.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122684978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
An FPGA Implementation of a Timing-Error Tolerant Discrete Cosine Transform (Abstract Only)
Yaoqiang Li, P. Chuang, A. Kennings, M. Sachdev
{"title":"An FPGA Implementation of a Timing-Error Tolerant Discrete Cosine Transform (Abstract Only)","authors":"Yaoqiang Li, P. Chuang, A. Kennings, M. Sachdev","doi":"10.1145/2684746.2689113","DOIUrl":"https://doi.org/10.1145/2684746.2689113","url":null,"abstract":"We present a Discrete Cosine Transform (DCT) unit embedded with Error Detection Sequential (EDS) and Dynamic Voltage Scaling (DVS) circuits to speculatively monitor its noncritical datapaths. This monitoring strategy requires no buffer insertions with only minimal modifications to the existing digital design methodology and is therefore applicable for Field-Programmable Gate Array (FPGA) implementations. The proposed design is implemented in an FPGA. The duty cycles of the constraint clock and the actual clock are differentiated to guide the synthesizer to place the EDS circuits with specific timing margin. The proposed design is tested with two classic images and is able to detect timing errors in the noncritical datapaths due to dynamic process, voltage and temperature (PVT) variations. The DVS circuit correspondingly controls a linear voltage regulator to adjust the supply voltage to the Point of First Failure (PoFF). No actual timing errors are generated, primarily because of the unique speculative characteristic of the proposed monitoring strategy. Our proposed design incurs a 0.3% logic element overhead and 3.5% maximum frequency degradation. By lowering the supply voltage by 8.3%, the proposed design saves up to 16.5% energy when operating at the same frequency as a highly optimized baseline DCT implementation.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122787162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Superoptimized Memory Subsystems for Streaming Applications
Joseph G. Wingbermuehle, R. Cytron, R. Chamberlain
{"title":"Superoptimized Memory Subsystems for Streaming Applications","authors":"Joseph G. Wingbermuehle, R. Cytron, R. Chamberlain","doi":"10.1145/2684746.2689069","DOIUrl":"https://doi.org/10.1145/2684746.2689069","url":null,"abstract":"Because main memory is many times slower than modern processor cores, deep, multi-level cache hierarchies are ubiquitous in computers today. Similarly, applications deployed on ASICs and FPGAs are often hindered by slow external memories. Therefore, to achieve good performance, hardware designers must optimize main memory usage. Unfortunately, this process is often labor intensive and fails to explore the full range of potential memory designs. To address this issue for applications expressed in a streaming manner, we show that it is possible to generate automatically a superoptimized memory subsystem that can be deployed on an FPGA such that it performs better than a general-purpose memory subsystem. Rather than explore only simple memory subsystems, our superoptimizer is capable of exploring extremely complex designs consisting of multi-level caches and other components. Finally, we show that it is possible to deploy applications with superoptimized memory subsystems with minimal additional effort while achieving significant performance improvements over a naive memory subsystem.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"177 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127925207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
FiT: An Automated Toolkit for Matching Processor Architecture to Applications (Abstract Only)
C. Mutigwe, J. Kinyua, F. Aghdasi
{"title":"FiT: An Automated Toolkit for Matching Processor Architecture to Applications (Abstract Only)","authors":"C. Mutigwe, J. Kinyua, F. Aghdasi","doi":"10.1145/2684746.2689117","DOIUrl":"https://doi.org/10.1145/2684746.2689117","url":null,"abstract":"As the complexity of designing electronic systems continues to grow, the most commonly used solution has been to move the design process to higher levels of abstraction via software tools. In this work we present one such tool that can be used to automatically generate custom processors and systems-on-chip (SoC) from C source code or application binary files, with no requirement for the user to understand any of the underlying hardware systems. This tool also does not call for the application to be profiled for any 'hot spots' as a prerequisite for generating the custom processor. We use the toolkit to generate two types of custom processors; the area-optimized processors and the performance-optimized processors. We study the resource utilization of the custom processors and compare them with those predicted by the core density model. We find that the performance-optimized processor results are as predicted by the core density model.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123262239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Formal Verification ATPG Search Engine Emulator (Abstract Only)
G. Ford, A. Krishna, J. Abraham, D. Saab
{"title":"Formal Verification ATPG Search Engine Emulator (Abstract Only)","authors":"G. Ford, A. Krishna, J. Abraham, D. Saab","doi":"10.1145/2684746.2689105","DOIUrl":"https://doi.org/10.1145/2684746.2689105","url":null,"abstract":"Bounded Model Checking (BMC), as a formal method of verifying VLSI circuits, shows violation of a given circuit property by finding a counter-example to the property along bounded state paths of the circuit. In this paper, we present an emulation framework for Automatic Test Pattern Generation (ATPG)-BMC model capable of checking properties on gate-level design. In our approach, counterpart to a property is mapped into a structural monitor with one output. A target fault is then injected at the monitor output, and a modified ATPG-based state justification algorithm is used to find a test for this fault which corresponds to formally establishing the property. In this paper, emulating the process of ATPG-based BMC on reconfigurable hardware is presented. The ATPG-BMC emulator achieves a speed-up over software based methods, due to the fine-grain massive parallelism inherent to hardware. As circuit sizes approach limits of even ATPG-based method feasibility, further solutions are required. In this presentation, we propose an ATPG-based algorithm for formal verification implementation on reconfigurable hardware (FPGA). This implementation is shown to have a linear relationship between the size of the circuit being verified and FPGA resource utilization. This implies a reasonable bound on the size of the implementation, as opposed to an exponential utilization explosion as circuit size increases. This method has also been shown to be 3 orders of magnitude faster than a similar software-based approach, based on the time for solving a given ATPG problem. At the same time, though, total runtime for the FPGA emulation based implementation is significantly limited by the parts of its process still in software. Further enhancement is proposed to reduce this overhead and increase the benefit over software solvers.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121736395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Evening Panel
J. Lockwood
{"title":"Session details: Evening Panel","authors":"J. Lockwood","doi":"10.1145/3251654","DOIUrl":"https://doi.org/10.1145/3251654","url":null,"abstract":"","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125345945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
0.5-V Highly Power-Efficient Programmable Logic using Nonvolatile Configuration Switch in BEOL
M. Miyamura, T. Sakamoto, Y. Tsuji, M. Tada, N. Banno, K. Okamoto, N. Iguchi, H. Hada
{"title":"0.5-V Highly Power-Efficient Programmable Logic using Nonvolatile Configuration Switch in BEOL","authors":"M. Miyamura, T. Sakamoto, Y. Tsuji, M. Tada, N. Banno, K. Okamoto, N. Iguchi, H. Hada","doi":"10.1145/2684746.2689088","DOIUrl":"https://doi.org/10.1145/2684746.2689088","url":null,"abstract":"A low-power nonvolatile programmable-logic cell array is proposed for energy-constrained applications such as wireless sensor nodes and mobile apparatuses. A 64x64 programmable-logic cell array includes a 9.2-Mbit nonvolatile switch, namely atom switch, as the routing switch and configuration memory. A 16-bit arithmetic logic unit, which is a building block of the micro-controller unit, was implemented to compare the speed and power consumption with a state-of-the-art low power field-programmable gate array. The proposed programmable-logic array exhibited 30% dynamic power saving and x2.5 faster operation in the low-voltage region. Zero sleep power was also demonstrated.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130880793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12