Re-configurable, expandable, and cost-effective heterogeneous FPGA cluster approach for resource-constrained data analysis

Impact factor: 0.7 · JCR quartile: Q4 · Category: Computer Science, Theory & Methods

International Journal of Parallel Emergent and Distributed Systems · Pub Date: 2022-06-15 · DOI: 10.1080/17445760.2022.2085703

Dulana Rupanetti, Hassan A. Salamy, Cheol-Hong Min, Kundan Nepal

Volume 37, pages 696-713 · Publication type: Journal Article

Citations: 0


Re-configurable, expandable, and cost-effective heterogeneous FPGA cluster approach for resource-constrained data analysis

Field programmable gate arrays (FPGAs) have become widely prevalent in recent years as an alternative to application-specific integrated circuits (ASICs) and as a potentially cheap alternative to expensive graphics processing units (GPUs). Introduced as a prototyping solution for ASICs, FPGAs are now widely used in applications that require processing data rapidly, such as artificial intelligence (AI) and machine learning (ML) models. As a relatively low-cost alternative to GPUs, FPGAs have the advantage that they can be reprogrammed for use in almost any data-driven application. In this work, we propose an easily scalable and cost-effective cluster-based co-processing system using FPGAs for ML and AI applications that is easily reconfigured to the requirements of each user application. The aim is to introduce a clustering system of FPGA boards to improve the efficiency of the training component of machine learning algorithms. Our proposed configuration makes it possible to build a cluster from relatively inexpensive FPGA development boards without expert knowledge of VHDL, Verilog, or the system designs related to FPGA development. The system consists of two parts, a computer-based host application that controls the cluster and an FPGA cluster connected through a high-speed Ethernet switch, and it allows users to customise and adapt it without much effort. The methods proposed in this paper allow any FPGA board with an Ethernet port to be used as part of the cluster, which can be scaled without a fixed bound. To demonstrate the flexibility and portability of the proposed work, a two-part experiment, one part with a homogeneous cluster and one with a heterogeneous cluster, was conducted, with results compared against a desktop computer and combinations of FPGAs in the two clusters. Data sets ranging in size from 60,000 to 14 million, including stroke prediction and COVID-19, were used in the experiments. Results suggest that the proposed system performs close to 70% faster than a traditional computer with similar accuracy rates.
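The abstract does not say how the host application divides work among boards, so the following is only an illustrative sketch of one plausible approach: splitting a data set into contiguous shards sized in proportion to each board's relative throughput. The names `Board`, `weight`, and `partition` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    """A cluster node. Both fields are hypothetical, not taken from the paper."""
    name: str
    weight: float  # assumed relative throughput of this board


def partition(num_samples: int, boards: list[Board]) -> dict[str, range]:
    """Split sample indices 0..num_samples-1 into contiguous per-board ranges.

    Each board's share is proportional to its weight. Samples left over after
    rounding down go to the boards with the largest fractional remainders.
    """
    if not boards:
        raise ValueError("cluster needs at least one board")
    if any(b.weight < 0 for b in boards):
        raise ValueError("weights must be non-negative")
    total = sum(b.weight for b in boards)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    exact = [num_samples * b.weight / total for b in boards]
    counts = [int(x) for x in exact]  # round each share down
    leftover = num_samples - sum(counts)
    by_remainder = sorted(
        range(len(boards)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    shards: dict[str, range] = {}
    start = 0
    for board, count in zip(boards, counts):
        shards[board.name] = range(start, start + count)
        start += count
    return shards


if __name__ == "__main__":
    # Hypothetical heterogeneous cluster: one faster board and two slower ones.
    cluster = [Board("fast", 2.0), Board("slow-1", 1.0), Board("slow-2", 1.0)]
    for name, shard in partition(60_000, cluster).items():
        print(name, len(shard))
```

A homogeneous cluster is the special case in which every board has the same weight, so the shards differ in size by at most one sample.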


Source journal

International Journal of Parallel Emergent and Distributed Systems (Computer Science, Theory & Methods)

CiteScore

2.30

Self-citation rate

0.00%

Number of articles published