Lorenzo Farinelli, Daniele Valentino De Vincenti, Andrea Damiani, Luca Stornaiuolo, Rolando Brondolin, M. Santambrogio, D. Sciuto
Published in: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021-06-01
DOI: 10.1109/IPDPSW52791.2021.00023 (https://doi.org/10.1109/IPDPSW52791.2021.00023)
Citations: 0
Plaster: an Embedded FPGA-based Cluster Orchestrator for Accelerated Distributed Algorithms
Abstract
The increasing use of real-time, data-intensive applications and the growing interest in heterogeneous architectures have created a need for increasingly complex embedded computing systems. One example is the research carried out by both the scientific community and industry on embedded multi-FPGA systems that implement the inference phase of Convolutional Neural Networks. In this paper, we focus on optimizing the management system of these embedded FPGA-based distributed systems. We extend the state-of-the-art FARD framework to data-intensive applications in an embedded scenario. Our orchestration and management infrastructure benefits from a compiled language and is accessible to end users through Python APIs, which provide a simple way to interact with the cluster and to design applications that run on the embedded nodes. The proposed prototype system consists of a PYNQ-based cluster of multiple FPGAs and has been evaluated by running an FPGA-based You Only Look Once (YOLO) image classification algorithm.