A scalable, serially-equivalent, high-quality parallel placement methodology suitable for modern multicore and GPU architectures

2014 24th International Conference on Field Programmable Logic and Applications (FPL), Pub Date: 2014-10-20, DOI: 10.1109/FPL.2014.6927481

C. Fobel, G. Grewal, D. Stacey


Citations: 17

Abstract

Placement and routing run-times continue to dominate the automated FPGA design flow. As the size of FPGA architectures continues to grow exponentially, it remains critical to develop parallel tools for FPGA design in which the amount of exposed concurrent work scales with the size of the designs to be synthesized. In this paper, we propose a novel algorithm for parallel placement, based on simulated annealing, in which the amount of parallel work scales directly with the size of the netlist to be placed. Our approach concurrently evaluates and conditionally applies very large sets of non-conflicting swaps using common parallel computing primitives, including stream compaction, category reduction, and sort. While our design is suitable for all modern parallel computing platforms, we present results from an implementation that targets NVIDIA's CUDA platform, where we achieve a mean speed-up of 19x over VPR, with post-routing critical-path delay and wire length that match or exceed the quality of VPR. We believe that this work is an important step towards the development of a scalable, high-quality placement tool.
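The abstract names the building blocks (non-conflicting swaps, sort, category reduction, stream compaction) but not how they fit together. The following is a minimal serial sketch of one annealing step built from those primitives; it is not the authors' implementation. The half-perimeter wire-length cost, the data layout, and every function name here are assumptions chosen for illustration. Each swap's cost change is evaluated against the placement as it stood before the step, so swaps that share a net can interact; the abstract does not say how the paper treats that case.

```python
import math
import random
from itertools import groupby


def hpwl(net, pos):
    """Half-perimeter wire length of one net (assumed cost function)."""
    xs = [pos[b][0] for b in net]
    ys = [pos[b][1] for b in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def disjoint_swaps(blocks, rng):
    """Pair blocks so that no block appears in more than one swap."""
    order = blocks[:]
    rng.shuffle(order)
    return [(order[i], order[i + 1]) for i in range(0, len(order) - 1, 2)]


def anneal_step(pos, nets, temperature, rng):
    """One step: evaluate all swaps independently, then apply the accepted ones.

    pos maps block id -> (x, y); nets is a list of lists of block ids;
    temperature must be positive.
    """
    blocks = sorted(pos)
    swaps = disjoint_swaps(blocks, rng)

    # Index: which nets touch each block.
    nets_of = {b: [] for b in blocks}
    for n, net in enumerate(nets):
        for b in net:
            nets_of[b].append(n)

    # Map: one (swap_id, delta) record per (swap, affected net) pair.
    # On a GPU each record could be computed by its own thread.
    records = []
    for sid, (a, b) in enumerate(swaps):
        trial = dict(pos)
        trial[a], trial[b] = pos[b], pos[a]
        for n in set(nets_of[a]) | set(nets_of[b]):
            records.append((sid, hpwl(nets[n], trial) - hpwl(nets[n], pos)))

    # Sort by key, then category reduction (reduce-by-key): sum deltas per swap.
    records.sort(key=lambda r: r[0])
    delta = {sid: sum(d for _, d in group)
             for sid, group in groupby(records, key=lambda r: r[0])}

    # Simulated-annealing acceptance flag for every swap.
    flags = []
    for sid in range(len(swaps)):
        d = delta.get(sid, 0)
        flags.append(d <= 0 or rng.random() < math.exp(-d / temperature))

    # Stream compaction: keep only the swaps whose flag is set.
    accepted = [s for s, keep in zip(swaps, flags) if keep]

    # Apply: swaps are disjoint, so the writes cannot collide.
    new_pos = dict(pos)
    for a, b in accepted:
        new_pos[a], new_pos[b] = pos[b], pos[a]
    return new_pos, accepted
```

Because every block belongs to at most one swap, the evaluation and apply loops have no write conflicts, which is the property that would let each phase run as a data-parallel kernel. The number of swaps grows with the number of blocks, which matches the abstract's claim that the parallel work scales with netlist size.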

