AIgean: An Open Framework for Deploying Machine Learning on Heterogeneous Clusters

ACM Transactions on Reconfigurable Technology and Systems (TRETS) Pub Date : 2021-12-28 DOI:10.1145/3482854

Naif Tarafdar, G. Di Guglielmo, P. Harris, J. Krupa, V. Loncar, D. Rankin, Nhan Tran, Zhenbin Wu, Q. Shen, P. Chow


Citations: 6

Abstract

AIgean, pronounced like the sea, is an open framework for building and deploying machine learning (ML) algorithms on a heterogeneous cluster of devices (CPUs and FPGAs). We leverage two open source projects: Galapagos, for multi-FPGA deployment, and hls4ml, for generating ML kernels synthesizable using Vivado HLS. AIgean provides a full end-to-end multi-FPGA/CPU implementation of a neural network. The user supplies a high-level neural network description, and our tool flow is responsible for synthesizing the individual layers, partitioning the layers across different nodes, and providing the bridging and routing these layers need to communicate. For users who are experts in a particular domain and would like to tinker with the implementation details of the neural network, we define a flexible implementation stack for ML that includes the layers of Algorithms, Cluster Deployment & Communication, and Hardware. This allows the user to modify specific layers of abstraction without having to worry about components outside of their area of expertise, highlighting the modularity of AIgean. We demonstrate the effectiveness of AIgean with two use cases: an autoencoder, and ResNet-50 running across 10 and 12 FPGAs. AIgean leverages the FPGA's strength in low-latency computing, as our implementations target a batch size of 1.
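
To make the partitioning step in the abstract concrete, here is a minimal Python sketch of one way consecutive layers could be packed onto nodes and the node-to-node links identified. It is an illustration only and not AIgean's actual algorithm: the `Layer` type, the single `cost` number standing in for a resource estimate, the `budget` per node, and the greedy packing rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    cost: int  # hypothetical per-layer resource estimate (assumption)


def partition_layers(layers, budget):
    """Greedily pack consecutive layers onto nodes within a per-node budget."""
    nodes, current, used = [], [], 0
    for layer in layers:
        if layer.cost > budget:
            raise ValueError(f"layer {layer.name} does not fit on one node")
        # Start a new node when the next layer would exceed the budget.
        if current and used + layer.cost > budget:
            nodes.append(current)
            current, used = [], 0
        current.append(layer)
        used += layer.cost
    if current:
        nodes.append(current)
    return nodes


def inter_node_links(nodes):
    """List (src_layer, dst_layer, src_node, dst_node) for links that cross nodes."""
    placement = [(layer.name, idx)
                 for idx, group in enumerate(nodes)
                 for layer in group]
    return [(a, b, na, nb)
            for (a, na), (b, nb) in zip(placement, placement[1:])
            if na != nb]


if __name__ == "__main__":
    # Hypothetical four-layer network; the costs are made-up numbers.
    net = [Layer("conv1", 40), Layer("conv2", 50),
           Layer("fc1", 30), Layer("fc2", 20)]
    groups = partition_layers(net, budget=100)
    print([[l.name for l in g] for g in groups])
    print(inter_node_links(groups))
```

In this sketch, only the links returned by `inter_node_links` would need network bridging between devices; layers placed on the same node communicate locally.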


Source journal

ACM Transactions on Reconfigurable Technology and Systems (TRETS)

Self-citation rate

0.00%

Publication count