GPU-Accelerated Wire-Length Estimation for FPGA Placement

2011 Symposium on Application Accelerators in High-Performance Computing. Pub Date: 2011-07-19. DOI: 10.1109/SAAHPC.2011.16

C. Fobel, G. Grewal, D. Stacey


Citations: 4

Abstract

In the FPGA design flow, placement remains one of the most time-consuming stages, and it is also crucial to the quality of the result. HPWL and Star+ are widely used as cost metrics in FPGA placement for estimating the total wire-length of a candidate placement prior to routing. However, both wire-length models are expensive to compute, requiring O(nm) time, where n is the number of nets and m is the average net cardinality. This paper proposes using the massively multi-threaded architecture provided by GPUs to reduce the time required to compute HPWL and Star+. First, a specialized set of data structures is developed for storing net-connectivity information on the GPU. Next, a study is performed to determine how best to map the data structures onto the GPU to exploit the heterogeneous memories and thread-level parallelism that are available. Finally, a study is performed to determine what effect circuit size and net cardinality have on the speedups that can be achieved. Overall, the results show that speedups of as much as 160x over a serial CPU implementation can be achieved for both models when tested using standard benchmarks.
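For readers unfamiliar with the metric, the following is a minimal serial sketch of HPWL (half-perimeter wire-length), written to make the O(nm) cost visible: the outer loop runs over the n nets and the inner work touches each net's m connected blocks. The flat `net_offsets` / `net_blocks` layout and the function name `hpwl` are assumptions chosen for illustration; they are not the data structures or code described in the paper. Star+ is not sketched here.

```python
from typing import Sequence


def hpwl(
    net_offsets: Sequence[int],
    net_blocks: Sequence[int],
    xs: Sequence[int],
    ys: Sequence[int],
) -> int:
    """Sum of bounding-box half-perimeters over all nets.

    Hypothetical layout (an assumption, not the paper's):
    the blocks of net i are net_blocks[net_offsets[i]:net_offsets[i + 1]],
    and block b is placed at (xs[b], ys[b]).
    """
    total = 0
    for net in range(len(net_offsets) - 1):
        start, end = net_offsets[net], net_offsets[net + 1]
        if end - start < 2:
            # A net with fewer than two blocks spans no wire.
            continue
        net_x = [xs[b] for b in net_blocks[start:end]]
        net_y = [ys[b] for b in net_blocks[start:end]]
        # Half-perimeter of the bounding box enclosing the net's blocks.
        total += (max(net_x) - min(net_x)) + (max(net_y) - min(net_y))
    return total


# Two nets over four blocks: net 0 = {0, 1, 2}, net 1 = {1, 3}.
example_offsets = [0, 3, 5]
example_blocks = [0, 1, 2, 1, 3]
example_xs = [0, 3, 1, 4]
example_ys = [0, 2, 5, 1]
print(hpwl(example_offsets, example_blocks, example_xs, example_ys))  # 10
```

Each net's bounding box depends only on that net's own blocks, so the per-net terms are independent of one another. That independence is what makes the computation a plausible fit for many GPU threads working on different nets at once, followed by a sum reduction; how the paper actually assigns work to threads and memories is not stated in the abstract.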

