基于fpga的全对最短路径的软硬件集成

Uday Bondhugula, A. Devulapalli, James Dinan, Joseph A. Fernando, P. Wyckoff, E. Stahlberg, P. Sadayappan
{"title":"基于fpga的全对最短路径的软硬件集成","authors":"Uday Bondhugula, A. Devulapalli, James Dinan, Joseph A. Fernando, P. Wyckoff, E. Stahlberg, P. Sadayappan","doi":"10.1109/FCCM.2006.48","DOIUrl":null,"url":null,"abstract":"Field-programmable gate arrays (FPGAs) are being employed in high performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Parallel FPGA-based designs often yield a very high speedup. Applications using these designs on reconfigurable supercomputers involve software on the system managing computation on the FPGA. To extract maximum performance from an FPGA design at the application level, it becomes necessary to minimize associated data movement costs on the system. We address this hardware/software integration challenge in the context of the all-pairs shortest-paths (APSP) problem in a directed graph. We employ a parallel FPGA-based design using a blocked algorithm to solve large instances of APSP. With appropriate design choices and optimizations, experimental results on the Cray XD1 show that the FPGA-based implementation sustains an application-level speedup of 15 over an optimized CPU-based implementation","PeriodicalId":123057,"journal":{"name":"2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":"{\"title\":\"Hardware/Software Integration for FPGA-based All-Pairs Shortest-Paths\",\"authors\":\"Uday Bondhugula, A. Devulapalli, James Dinan, Joseph A. Fernando, P. Wyckoff, E. Stahlberg, P. Sadayappan\",\"doi\":\"10.1109/FCCM.2006.48\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Field-programmable gate arrays (FPGAs) are being employed in high performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Parallel FPGA-based designs often yield a very high speedup. Applications using these designs on reconfigurable supercomputers involve software on the system managing computation on the FPGA. To extract maximum performance from an FPGA design at the application level, it becomes necessary to minimize associated data movement costs on the system. We address this hardware/software integration challenge in the context of the all-pairs shortest-paths (APSP) problem in a directed graph. We employ a parallel FPGA-based design using a blocked algorithm to solve large instances of APSP. With appropriate design choices and optimizations, experimental results on the Cray XD1 show that the FPGA-based implementation sustains an application-level speedup of 15 over an optimized CPU-based implementation\",\"PeriodicalId\":123057,\"journal\":{\"name\":\"2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines\",\"volume\":\"38 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-04-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"27\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FCCM.2006.48\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FCCM.2006.48","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

摘要

现场可编程门阵列(fpga)由于其加速各种长期运行程序的潜力而被应用于高性能计算系统中。基于并行fpga的设计通常产生非常高的加速。在可重构超级计算机上使用这些设计的应用程序涉及系统上的软件管理FPGA上的计算。为了在应用程序级别从FPGA设计中获得最大的性能,有必要将系统上相关的数据移动成本降至最低。我们在有向图中的全对最短路径(APSP)问题的背景下解决了这个硬件/软件集成挑战。我们采用基于并行fpga的设计,使用阻塞算法来解决大型APSP实例。通过适当的设计选择和优化,在Cray XD1上的实验结果表明,基于fpga的实现比基于cpu的优化实现保持了15%的应用级加速
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Hardware/Software Integration for FPGA-based All-Pairs Shortest-Paths
Field-programmable gate arrays (FPGAs) are being employed in high performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Parallel FPGA-based designs often yield a very high speedup. Applications using these designs on reconfigurable supercomputers involve software on the system managing computation on the FPGA. To extract maximum performance from an FPGA design at the application level, it becomes necessary to minimize associated data movement costs on the system. We address this hardware/software integration challenge in the context of the all-pairs shortest-paths (APSP) problem in a directed graph. We employ a parallel FPGA-based design using a blocked algorithm to solve large instances of APSP. With appropriate design choices and optimizations, experimental results on the Cray XD1 show that the FPGA-based implementation sustains an application-level speedup of 15 over an optimized CPU-based implementation
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信