AIgean: An Open Framework for Deploying Machine Learning on Heterogeneous Clusters

Naif Tarafdar, G. Di Guglielmo, P. Harris, J. Krupa, V. Loncar, D. Rankin, Nhan Tran, Zhenbin Wu, Q. Shen, P. Chow
{"title":"AIgean: An Open Framework for Deploying Machine Learning on Heterogeneous Clusters","authors":"Naif Tarafdar, G. Di Guglielmo, P. Harris, J. Krupa, V. Loncar, D. Rankin, Nhan Tran, Zhenbin Wu, Q. Shen, P. Chow","doi":"10.1145/3482854","DOIUrl":null,"url":null,"abstract":"AIgean, pronounced like the sea, is an open framework to build and deploy machine learning (ML) algorithms on a heterogeneous cluster of devices (CPUs and FPGAs). We leverage two open source projects: Galapagos, for multi-FPGA deployment, and hls4ml, for generating ML kernels synthesizable using Vivado HLS. AIgean provides a full end-to-end multi-FPGA/CPU implementation of a neural network. The user supplies a high-level neural network description, and our tool flow is responsible for the synthesizing of the individual layers, partitioning layers across different nodes, as well as the bridging and routing required for these layers to communicate. If the user is an expert in a particular domain and would like to tinker with the implementation details of the neural network, we define a flexible implementation stack for ML that includes the layers of Algorithms, Cluster Deployment & Communication, and Hardware. This allows the user to modify specific layers of abstraction without having to worry about components outside of their area of expertise, highlighting the modularity of AIgean. We demonstrate the effectiveness of AIgean with two use cases: an autoencoder, and ResNet-50 running across 10 and 12 FPGAs. AIgean leverages the FPGA’s strength in low-latency computing, as our implementations target batch-1 implementations.","PeriodicalId":162787,"journal":{"name":"ACM Transactions on Reconfigurable Technology and Systems (TRETS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Reconfigurable Technology and Systems (TRETS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3482854","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

AIgean, pronounced like the sea, is an open framework to build and deploy machine learning (ML) algorithms on a heterogeneous cluster of devices (CPUs and FPGAs). We leverage two open source projects: Galapagos, for multi-FPGA deployment, and hls4ml, for generating ML kernels synthesizable using Vivado HLS. AIgean provides a full end-to-end multi-FPGA/CPU implementation of a neural network. The user supplies a high-level neural network description, and our tool flow is responsible for the synthesizing of the individual layers, partitioning layers across different nodes, as well as the bridging and routing required for these layers to communicate. If the user is an expert in a particular domain and would like to tinker with the implementation details of the neural network, we define a flexible implementation stack for ML that includes the layers of Algorithms, Cluster Deployment & Communication, and Hardware. This allows the user to modify specific layers of abstraction without having to worry about components outside of their area of expertise, highlighting the modularity of AIgean. We demonstrate the effectiveness of AIgean with two use cases: an autoencoder, and ResNet-50 running across 10 and 12 FPGAs. AIgean leverages the FPGA’s strength in low-latency computing, as our implementations target batch-1 implementations.
在异构集群上部署机器学习的开放框架
AIgean是一个开放的框架,用于在异构设备集群(cpu和fpga)上构建和部署机器学习(ML)算法。我们利用了两个开源项目:用于多fpga部署的Galapagos和用于生成可使用Vivado HLS合成的ML内核的hls4ml。AIgean提供了一个完整的端到端多fpga /CPU神经网络实现。用户提供高级神经网络描述,我们的工具流负责综合各个层,跨不同节点划分层,以及这些层通信所需的桥接和路由。如果用户是特定领域的专家,并且想要修补神经网络的实现细节,我们为ML定义了一个灵活的实现堆栈,包括算法层,集群部署和通信层以及硬件层。这允许用户修改特定的抽象层,而不必担心他们专业领域之外的组件,突出了AIgean的模块化。我们通过两个用例展示了AIgean的有效性:一个自动编码器,以及在10和12个fpga上运行的ResNet-50。AIgean利用FPGA在低延迟计算方面的优势,因为我们的实现目标是批处理1实现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信