Hardware/Software Integration for FPGA-based All-Pairs Shortest-Paths

2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. Pub Date: 2006-04-24. DOI: 10.1109/FCCM.2006.48

Uday Bondhugula, A. Devulapalli, James Dinan, Joseph A. Fernando, P. Wyckoff, E. Stahlberg, P. Sadayappan


Citations: 27

Abstract

Field-programmable gate arrays (FPGAs) are being employed in high-performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Parallel FPGA-based designs often yield a very high speedup. Applications using these designs on reconfigurable supercomputers involve software on the system managing computation on the FPGA. To extract maximum performance from an FPGA design at the application level, it becomes necessary to minimize associated data movement costs on the system. We address this hardware/software integration challenge in the context of the all-pairs shortest-paths (APSP) problem in a directed graph. We employ a parallel FPGA-based design using a blocked algorithm to solve large instances of APSP. With appropriate design choices and optimizations, experimental results on the Cray XD1 show that the FPGA-based implementation sustains an application-level speedup of 15x over an optimized CPU-based implementation.
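The abstract does not spell out the blocked algorithm. One widely used blocked formulation of APSP is a tiled Floyd-Warshall, and the Python sketch below assumes that formulation purely for illustration. The function names, the tile size `b`, and the example graph are hypothetical and do not come from the paper. Each round picks a diagonal tile, updates it, then updates the tiles in the same row and column, and finally updates all remaining tiles. This is the kind of tile-at-a-time structure that lets a fixed-size compute kernel work through a large distance matrix.

```python
import math

INF = math.inf

# Hypothetical 5-vertex directed graph as a distance matrix (INF = no edge).
EXAMPLE = [
    [0,   3,   INF, 7,   INF],
    [8,   0,   2,   INF, INF],
    [5,   INF, 0,   1,   INF],
    [2,   INF, INF, 0,   4],
    [INF, 1,   INF, INF, 0],
]


def floyd_warshall(dist):
    """Reference (unblocked) Floyd-Warshall; returns a new matrix."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def _update_tile(d, bi, bj, bk, b, n):
    """Relax tile (bi, bj) through every intermediate vertex in tile bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            for j in range(bj * b, min((bj + 1) * b, n)):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]


def blocked_floyd_warshall(dist, b):
    """Tiled Floyd-Warshall with b x b tiles; returns a new matrix."""
    n = len(dist)
    d = [row[:] for row in dist]
    nb = (n + b - 1) // b  # number of tiles per dimension
    for bk in range(nb):
        # Phase 1: the diagonal tile depends only on itself.
        _update_tile(d, bk, bk, bk, b, n)
        # Phase 2: tiles sharing a row or column with the diagonal tile.
        for t in range(nb):
            if t != bk:
                _update_tile(d, bk, t, bk, b, n)
                _update_tile(d, t, bk, bk, b, n)
        # Phase 3: all remaining tiles, using the phase-2 results.
        for bi in range(nb):
            for bj in range(nb):
                if bi != bk and bj != bk:
                    _update_tile(d, bi, bj, bk, b, n)
    return d
```

Calling `blocked_floyd_warshall(EXAMPLE, 2)` should give the same matrix as `floyd_warshall(EXAMPLE)`. In this sketch the phase-3 tiles within a round are independent of one another, which is what makes a blocked scheme attractive for parallel hardware. How the paper maps tiles onto the FPGA and schedules data movement is not described in the abstract.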
