RP-Ring: A Heterogeneous Multi-FPGA Accelerating Solution for N-Body Simulations

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). Pub Date: 2016-05-01. DOI: 10.1109/FCCM.2016.20

Tianqi Wang, Xi Jin, Bo Peng, Chuanjun Wang, Linlin Zheng


Citations: 2

Abstract


We propose a heterogeneous multi-FPGA accelerating solution, called RP-ring (Reconfigurable Processor ring), for direct-summation N-body simulation. To reduce cost, the solution uses existing FPGA boards rather than newly designed specialized boards. It can be expanded conveniently with any available FPGA board and requires only low communication bandwidth between boards. The communication protocol is simple and can be implemented with limited hardware and software resources. To prevent the slowest board from dragging down overall performance, we build a mathematical model to decompose the workload among the FPGAs. The model divides the workload based on the logic resources, memory access bandwidth, and communication bandwidth of each FPGA chip. We apply the solution to astrodynamics simulation and achieve a speedup of two orders of magnitude compared with CPU implementations.
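The abstract does not give the workload model itself, so the following Python sketch only illustrates the general idea of dividing particles among unequal boards. It assumes, purely for illustration, that each board's logic resources, memory bandwidth, and link bandwidth can each be expressed as a sustainable pairwise-interaction rate, that the smallest of the three limits the board, and that particles are split in proportion to that limit. The `Board` class, its fields, and `split_workload` are hypothetical names, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Board:
    """Hypothetical description of one FPGA board in the ring."""
    name: str
    compute_rate: float  # interactions/s the logic resources could sustain (assumed unit)
    memory_rate: float   # interactions/s the memory bandwidth could feed (assumed unit)
    link_rate: float     # interactions/s the ring link could feed (assumed unit)

    def effective_rate(self) -> float:
        # Assumption: the tightest of the three resources limits the board.
        return min(self.compute_rate, self.memory_rate, self.link_rate)


def split_workload(n_particles: int, boards: List[Board]) -> Dict[str, int]:
    """Assign particles to boards in proportion to their effective rates."""
    rates = [b.effective_rate() for b in boards]
    total = sum(rates)
    exact = [n_particles * r / total for r in rates]
    shares = [int(x) for x in exact]  # round each share down first
    # Hand the leftover particles to the boards with the largest fractional parts.
    leftover = n_particles - sum(shares)
    by_fraction = sorted(range(len(boards)),
                         key=lambda i: exact[i] - shares[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return {b.name: s for b, s in zip(boards, shares)}


if __name__ == "__main__":
    ring = [
        Board("A", compute_rate=4.0, memory_rate=10.0, link_rate=10.0),
        Board("B", compute_rate=2.0, memory_rate=1.0, link_rate=10.0),
        Board("C", compute_rate=10.0, memory_rate=10.0, link_rate=3.0),
    ]
    print(split_workload(800, ring))
```

With a proportional split of this kind, every board would finish its share in roughly the same time under the stated assumptions, which is the property the abstract describes as keeping the slowest board from limiting the whole ring.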
