Data-driven modeling of reconfigurable multi-accelerator systems under dynamic workloads

IF 1.9 | Zone 4, Computer Science | JCR Q3: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Microprocessors and Microsystems | Pub Date: 2024-04-09 | DOI: 10.1016/j.micpro.2024.105050

Juan Encinas, Alfonso Rodríguez, Andrés Otero, Eduardo de la Torre

Volume 107, Article 105050. Full text: https://www.sciencedirect.com/science/article/pii/S0141933124000450

Citations: 0

Abstract

Reconfigurable multi-accelerator systems used as computing offloading platforms in edge-cloud continuum scenarios usually have to deal with highly dynamic workloads and operating conditions. To take full advantage of their parallel processing capabilities and increase execution performance for a given workload, these systems need to continuously adapt their configuration (i.e., number and type of accelerators) at run time. When working at the edge, additional requirements such as energy efficiency must also be met. In this paper, Machine Learning techniques are applied to extract predictive models of the execution of different combinations of hardware accelerators on a reconfigurable multi-accelerator platform, aiming to satisfy these continuous optimization needs. One of the key benefits of the proposed approach is that its data-driven models can transparently estimate the impact of the complex interactions between hardware accelerators caused by run-time resource contention among them and with the rest of the system, as opposed to traditional modeling approaches (e.g., analytical models) that cannot include that information in an easy and scalable way. The proposed models are complemented by a complete infrastructure to generate, execute and monitor dynamic workloads in FPGA-based systems. This infrastructure has been used to (i) quantitatively analyze resource contention in reconfigurable multi-accelerator systems and (ii) produce the training and evaluation datasets for the ML-based models using annotated power consumption and execution performance traces. Experimental results obtained with a reconfigurable multi-accelerator platform based on the ARTICo³ framework running the MachSuite benchmarks show that the proposed modeling approach is highly effective, with a relative prediction error of less than 5% on average for both power consumption and execution performance. Results also show that the ML-based models achieve high accuracy levels when predicting the impact of resource contention and accelerator interaction on both metrics, with a mean relative prediction error of less than 0.6% and a standard deviation below 4.7% for the worst case.
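The accuracy figures quoted in the abstract are relative prediction errors summarized by a mean and a standard deviation. The abstract does not state the exact definitions used, so the following is only a minimal sketch under stated assumptions: the error is taken as (predicted - measured) / measured in percent, the standard deviation is the population form, and the function names `relative_errors` and `summarize` are hypothetical, as are the sample values, which are not results from the paper.

```python
from statistics import mean, pstdev


def relative_errors(predicted, measured):
    """Signed relative error, in percent, of each prediction.

    Assumed definition: 100 * (predicted - measured) / measured.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return [100.0 * (p - m) / m for p, m in zip(predicted, measured)]


def summarize(predicted, measured):
    """Summarize prediction quality for one metric (e.g. power or execution time)."""
    errs = relative_errors(predicted, measured)
    return {
        # Average magnitude of the error, ignoring its sign.
        "mean_abs_pct": mean(abs(e) for e in errs),
        # Average signed error; close to zero when over- and
        # under-predictions cancel out.
        "mean_signed_pct": mean(errs),
        # Spread of the signed errors (population standard deviation).
        "std_pct": pstdev(errs),
    }


if __name__ == "__main__":
    # Made-up illustrative values, e.g. measured vs. predicted power in watts.
    measured = [2.0, 4.0, 5.0]
    predicted = [2.1, 3.8, 5.0]
    for name, value in summarize(predicted, measured).items():
        print(f"{name}: {value:.2f}")
```

Reporting both a signed mean and a standard deviation is useful because a small signed mean alone can hide large errors of opposite sign; the spread shows how far individual predictions stray.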

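Source journal

Microprocessors and Microsystems (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 6.90

Self-citation rate: 3.80%

Articles published: 204

Review time: 172 days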

About the journal: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms, ranging from custom hardware via reconfigurable systems and application-specific processors to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC) and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with emphasis on hardware, are also within the journal's scope.