FPGA. ACM International Symposium on Field-Programmable Gate Arrays: Latest Articles, Page 2

Embedding-based placement of processing element networks on FPGAs for physical model simulation

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435297

Bailey Miller, F. Vahid, T. Givargis

Abstract: Physical models utilize mathematical equations to model physical systems like airway mechanics, neuron networks, or chemical reactions. Previous work has shown that physical models can execute fast on FPGAs (field-programmable gate arrays). We introduce an approach for implementing physical models on FPGAs that applies graph theoretic techniques to make use of a physical model's natural structure (tree, ring, chain, etc.), resulting in model execution speedups. A first phase of the approach maps physical model equations to a structured virtual PE (processing element) graph using graph theoretic folding techniques. A second phase maps the structured virtual PE graph to physical PE regions on an FPGA using graph embedding theory. We also present a simulated annealing approach with custom cost and neighbor functions that can map any physical model onto an FPGA with low wire costs. Average circuit speedup improvements over previous works for various physical models are 65% using the graph embedding and 35% using the simulated annealing approach. Each approach's more efficient use of FPGA resources also enables larger models to be implemented on an FPGA device.

Pages: 181-190

Citations: 3
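The abstract of this entry mentions a simulated annealing approach that maps a PE network onto an FPGA with low wire cost. As a rough illustration only, the sketch below runs a generic simulated annealing placement on a rectangular grid. The cost function (total Manhattan wire length), the neighbor move (swap a PE with another grid cell), the linear cooling schedule, and all names such as `anneal_placement` are assumptions made for this example; they are not the authors' custom cost and neighbor functions.

```python
import math
import random


def wire_cost(edges, pos):
    """Total Manhattan wire length over all PE-to-PE connections."""
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in edges)


def anneal_placement(num_pes, edges, width, height,
                     steps=20000, t0=5.0, seed=0):
    """Place num_pes PEs on a width x height grid (hypothetical helper).

    Returns (best placement, best cost). Generic annealing, not the
    paper's method.
    """
    assert num_pes <= width * height, "grid too small for the PE network"
    rng = random.Random(seed)
    # cells[i] for i < num_pes is the cell of PE i; the rest are empty cells.
    cells = [(x, y) for y in range(height) for x in range(width)]
    rng.shuffle(cells)
    pos = {i: cells[i] for i in range(num_pes)}
    cost = wire_cost(edges, pos)
    best_pos, best_cost = dict(pos), cost

    def swap(i, j):
        # Exchange the cells at indices i and j and refresh PE positions.
        cells[i], cells[j] = cells[j], cells[i]
        pos[i] = cells[i]
        if j < num_pes:
            pos[j] = cells[j]

    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling
        i = rng.randrange(num_pes)       # always move a real PE
        j = rng.randrange(len(cells))    # target: another PE or an empty cell
        if i == j:
            continue
        swap(i, j)
        new_cost = wire_cost(edges, pos)
        delta = new_cost - cost
        # Accept improvements always, and worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_pos, best_cost = dict(pos), cost
        else:
            swap(i, j)  # undo the rejected move
    return best_pos, best_cost


# Example: an 8-PE ring network placed on a 4x4 grid of PE regions.
ring = [(i, (i + 1) % 8) for i in range(8)]
placement, cost = anneal_placement(8, ring, width=4, height=4)
print(cost, placement)
```

Recomputing the full wire cost after every move keeps the sketch short; a practical placer would update the cost incrementally for the edges touching the moved PEs.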

A latency-optimized hybrid network for clustering FPGAs (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435312

Trevor Bunker, S. Swanson

Citations: 0

Performance and toolchain of a combined GPU/FPGA desktop (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435336

B. Silva, An Braeken, E. D'Hollander, A. Touhafi, Jan G. Cornelis, J. Lemeire

Abstract: Low-power, high-performance computing nowadays relies on accelerator cards to speed up calculations. Combining the power of GPUs with the flexibility of FPGAs enlarges the scope of problems that can be accelerated [2, 3]. We describe the performance analysis of a desktop equipped with a Tesla 2050 GPU and a Virtex-6 LX240T FPGA. First, the balance between the I/O and the raw peak performance is depicted using the roofline model [4]. Next, the performance of a number of image processing algorithms is measured and the results are mapped onto the roofline graph. This makes it possible to compare the GPU and the FPGA and also to optimize the algorithms for both accelerators. A programming toolchain is implemented, consisting of OpenCL for the GPU and several High-Level Synthesis (HLS) compilers for the FPGA. Our results show that the HLS compilers outperform handwritten code and offer a performance comparable to the GPU. In addition, the FPGA compilers reduce the development time by an order of magnitude, at the expense of increased resource consumption. The roofline model also shows that both accelerators are equally limited by the input/output bandwidth to the host. A well-tuned accelerator-based codesign, identifying the parallelism, the computation and the data patterns of different classes of algorithms, will make it possible to maximize the performance of the combined GPU/FPGA system [1].

Pages: 274

Citations: 6
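The abstract of this entry relies on the roofline model to relate I/O bandwidth to raw peak performance. In its standard form, attainable performance is the minimum of the compute peak and the product of bandwidth and operational intensity. The sketch below shows that relation in a few lines; the function names and all numbers are hypothetical placeholders chosen for the arithmetic, not measurements of the GPU or FPGA described in the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Roofline bound: limited by compute peak or by bandwidth x intensity."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Operational intensity (FLOP/byte) above which a kernel is compute-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical accelerator: 1000 GFLOP/s peak behind an 8 GB/s host link.
PEAK, HOST_BW = 1000.0, 8.0

for intensity in (2.0, 125.0, 500.0):
    bound = attainable_gflops(PEAK, HOST_BW, intensity)
    regime = "I/O-bound" if intensity < ridge_point(PEAK, HOST_BW) else "compute-bound"
    print(f"{intensity:6.1f} FLOP/byte -> {bound:7.1f} GFLOP/s ({regime})")
```

With these placeholder values, any kernel below 125 FLOP/byte sits on the sloped part of the roofline, which is the situation the abstract describes when it says both accelerators are limited by the bandwidth to the host.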

Sensing nanosecond-scale voltage attacks and natural transients in FPGAs

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435283

K. Zick, Meeta Srivastav, Wei Zhang, M. French

Citations: 94

Minimum energy operation for clustered island-style FPGAs

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435293

P. Grossmann, M. Leeser, M. Onabajo

Citations: 5

FPGA meta-data management system for accelerating implementation time with incremental compilation (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435323

A. Love, P. Athanas

Citations: 1

AutoMapper: an automated tool for optimal hardware resource allocation for networking applications on FPGA (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435335

Swapnil Haria, V. Prasanna

Abstract: It has now become imperative for routers to support complicated lookup schemes, based on the specific function of the networking hardware. It is no longer possible to ensure optimal resource utilization using manual organization techniques, due to the increasing complexity of lookup schemes as well as the large number of potential implementation choices. We have developed an automated tool, AutoMapper, which can map lookup schemes onto a particular target architecture optimally, thereby providing a superior alternative to the time-consuming and resource-inefficient technique of manual conversion. It is based on an Integer Linear Programming (ILP) formulation that is able to allocate the limited hardware resources for a single lookup scheme, while optimizing any of the three performance metrics of latency, throughput or power consumption. Accurate formulation of the objective function and the constraint equations guarantees optimality in terms of the chosen performance metric. We demonstrate the operation of the developed tool by successfully mapping complex real-world lookup schemes onto a state-of-the-art FPGA device, with execution times under a second on a dual-core computer with 4 GB of RAM, running at 2.40 GHz.

Pages: 274

Citations: 0

Are FPGAs suffering from the innovator's dilemna?

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435289

Vaughn Betz, J. Cong

Abstract: FPGAs constitute a highly profitable industry, with approximately $5 billion of sales per year. High barriers to entry keep most companies away, and enable high profit margins for the incumbents. The industry has grown greatly over the years, but still constitutes a small portion of the overall semiconductor market. This raises the question to be addressed by this panel: is the FPGA community innovating as much as it should, or is a bias to maintain high profit margins and protect the cash flow of the current FPGA market holding us back from exploring new ideas and products that could greatly expand the appeal of and market for FPGA-related technology? This would be a classic case of the innovator's dilemma defined by Clayton Christensen: it is difficult for a company to engage in creative destruction of a cash cow product.

Our distinguished panel of experts will discuss whether we are seeing major innovation in architectures, design flows and applications, or only incremental improvements. We will also discuss if any new (possibly low margin) application domain is left out by the FPGA industry, what radical ideas should be explored, and whether large incumbents, new startups, academia or some combination are best able to attack these new areas.

Pages: 135-136

Citations: 0

Circuit optimizations to minimize energy in the global interconnect of a low-power FPGA (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435341

Oluseyi A. Ayorinde, B. Calhoun

Citations: 0

Defect recovery in nanodevice-based programmable interconnects (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435343

J. Cong, Bingjun Xiao

Abstract: This work focuses on defect tolerance for nanodevice-based programmable interconnects of FPGAs. A single nanodevice can function as a routing switch in place of a pass transistor and its six-transistor SRAM cell in conventional FPGAs. Defects of nanodevices in programmable interconnects are manifested as losses of configurability and can be categorized into stuck-open defects and stuck-closed defects. First, we show that the stuck-closed defects of nanodevices have a much higher impact than the stuck-open defects. Instead of simply avoiding the stuck-closed defects, we recover them by treating them as shorting constraints in the routing. We develop a scalable algorithm to perform timing-driven routing under these extra constraints. We extend the idea of resource negotiation to balance the goals of timing and routability under shorting constraints. We also develop several techniques to guide the router to map the shorting clusters to those nets with more shared paths for better utilization of routing resources while automatically balancing it with circuit performance. We also enhance the placement algorithm to recover logic blocks which become virtually unusable due to shorted pins. Simulation results show that at the up-to-date level of nanodevice defects (10^8 to 10^11 times higher than CMOS), compared to the simple avoidance method, our approach reduces the degradation of resource usage by 87%, improves the routability by 37%, and reduces the degradation of circuit performance by 36%, at a negligible overhead of tool runtime.

Pages: 277-278

Citations: 0