A Scalable FPGA-based Multiprocessor

2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. Pub Date: 2006-04-24. DOI: 10.1109/FCCM.2006.17

A. Patel, Christopher A. Madill, Manuel Saldaña, C. Comis, R. Pomès, P. Chow


Citations: 64

Abstract


A Scalable FPGA-based Multiprocessor

It has been shown that a small number of FPGAs can significantly accelerate certain computing tasks by up to two or three orders of magnitude. However, particularly intensive large-scale computing applications, such as molecular dynamics simulations of biological systems, underscore the need for even greater speedups to address relevant length and time scales. In this work, we propose an architecture for a scalable computing machine built entirely using FPGA computing nodes. The machine enables designers to implement large-scale computing applications using a heterogeneous combination of hardware accelerators and embedded microprocessors spread across many FPGAs, all interconnected by a flexible communication network. Parallelism at multiple levels of granularity within an application can be exploited to obtain the maximum computational throughput. By focusing on applications that exhibit a high computation-to-communication ratio, we narrow the extent of this investigation to the development of a suitable communication infrastructure for our machine, as well as an appropriate programming model and design flow for implementing applications. By providing a simple, abstracted communication interface with the objective of being able to scale to thousands of FPGA nodes, the proposed architecture appears to the programmer as a unified, extensible FPGA fabric. A programming model based on the MPI message-passing standard is also presented as a means for partitioning an application into independent computing tasks that can be implemented on our architecture. Finally, we demonstrate the first use of our design flow by developing a simple molecular dynamics simulation application for the proposed machine, which runs on a small platform of development boards.
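The abstract describes partitioning an application into independent computing tasks under an MPI-style message-passing model, with molecular dynamics as the example workload. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes an all-pairs Lennard-Jones force kernel in reduced units and emulates the ranks inside one Python process: each "rank" receives the full position list (standing in for a broadcast), computes forces for its own block of particles, and the blocks are concatenated (standing in for a gather). The names `lj_force`, `partition`, `forces_for_block`, and `simulate_step` are hypothetical and do not come from the paper.

```python
def lj_force(ri, rj):
    """Lennard-Jones force on particle i due to particle j (epsilon = sigma = 1)."""
    dx = [a - b for a, b in zip(ri, rj)]
    r2 = sum(d * d for d in dx)
    inv6 = 1.0 / r2 ** 3                      # r^-6
    scale = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
    return [scale * d for d in dx]


def partition(n, size):
    """Split n particles into `size` contiguous blocks, as evenly as possible."""
    base, extra = divmod(n, size)
    bounds, start = [], 0
    for rank in range(size):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def forces_for_block(positions, start, stop):
    """Work of one rank: total force on each particle in [start, stop)."""
    block = []
    for i in range(start, stop):
        total = [0.0, 0.0, 0.0]
        for j, rj in enumerate(positions):
            if j != i:
                fij = lj_force(positions[i], rj)
                total = [a + b for a, b in zip(total, fij)]
        block.append(total)
    return block


def simulate_step(positions, size):
    """Emulate one force step: every rank sees all positions, results are gathered."""
    gathered = []
    for start, stop in partition(len(positions), size):
        # In a real message-passing program this loop body would run on a
        # separate rank, with positions broadcast in and forces gathered out.
        gathered.extend(forces_for_block(positions, start, stop))
    return gathered


if __name__ == "__main__":
    pos = [[float(x), float(y), 0.0] for x in range(3) for y in range(2)]
    print(partition(len(pos), 4))
    print(simulate_step(pos, 4)[0])
```

Because each block depends only on the shared positions, the per-rank work is independent and the only data exchanged per step is the position list and the resulting forces. That is the kind of high computation-to-communication workload the abstract targets: the pair evaluations per rank grow with the square of the particle count divided by the number of ranks, while the exchanged data grows only linearly.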

