Energy-Aware Heterogeneous Federated Learning via Approximate DNN Accelerators

Impact Factor: 2.9 · CAS Tier 3 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture)
Kilian Pfeiffer, Konstantinos Balaskas, Kostas Siozios, Jörg Henkel
{"title":"Energy-Aware Heterogeneous Federated Learning via Approximate DNN Accelerators","authors":"Kilian Pfeiffer;Konstantinos Balaskas;Kostas Siozios;Jörg Henkel","doi":"10.1109/TCAD.2024.3509793","DOIUrl":null,"url":null,"abstract":"In Federated Learning (FL), devices that participate in the training usually have heterogeneous resources, i.e., energy availability. In current deployments of FL, devices that do not fulfill certain hardware requirements are often dropped from the collaborative training. However, dropping devices in FL can degrade training accuracy and introduce bias or unfairness. Several works have tackled this problem on an algorithm level, e.g., by letting constrained devices train a subset of the server neural network (NN) model. However, it has been observed that these techniques are not effective w.r.t. accuracy. Importantly, they make simplistic assumptions about devices’ resources via indirect metrics, such as multiply accumulate (MAC) operations or peak memory requirements. We observe that memory access costs (that are currently not considered in simplistic metrics) have a significant impact on the energy consumption. In this work, for the first time, we consider on-device accelerator design for FL with heterogeneous devices. We utilize compressed arithmetic formats and approximate computing, targeting to satisfy limited energy budgets. Using a hardware-aware energy model, we observe that, contrary to the state of the art’s moderate energy reduction, our technique allows for lowering the energy requirements (by <inline-formula> <tex-math>$4\\times $ </tex-math></inline-formula>) while maintaining higher accuracy.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 6","pages":"2054-2066"},"PeriodicalIF":2.9000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10771979/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

In Federated Learning (FL), devices that participate in the training usually have heterogeneous resources, e.g., in energy availability. In current deployments of FL, devices that do not fulfill certain hardware requirements are often dropped from the collaborative training. However, dropping devices in FL can degrade training accuracy and introduce bias or unfairness. Several works have tackled this problem at the algorithm level, e.g., by letting constrained devices train a subset of the server neural network (NN) model. However, these techniques have been observed to be ineffective with respect to accuracy. Importantly, they make simplistic assumptions about devices' resources via indirect metrics, such as multiply-accumulate (MAC) operations or peak memory requirements. We observe that memory access costs (currently not considered in such simplistic metrics) have a significant impact on energy consumption. In this work, for the first time, we consider on-device accelerator design for FL with heterogeneous devices. We utilize compressed arithmetic formats and approximate computing, aiming to satisfy limited energy budgets. Using a hardware-aware energy model, we observe that, in contrast to the moderate energy reduction achieved by the state of the art, our technique lowers the energy requirements by $4\times$ while maintaining higher accuracy.
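To make the algorithm-level baseline concrete: in subset training, a constrained device receives and trains only a width-reduced slice of the server model. Below is a minimal sketch of this idea for a two-layer MLP, assuming hypothetical NumPy weight matrices and HeteroFL-style slicing; it illustrates the prior work the abstract refers to, not this paper's technique.

```python
import numpy as np

def extract_subnet(W1, W2, keep: float):
    """Width-scaled subset of a 2-layer MLP: keep a fraction of the
    hidden units; input and output dimensions stay full so the
    submodel remains compatible with the original task."""
    h = max(1, int(W1.shape[1] * keep))  # number of hidden units kept
    return W1[:, :h], W2[:h, :]          # consistent slices of both layers

rng = np.random.default_rng(1)
W1 = rng.standard_normal((784, 256))     # input -> hidden
W2 = rng.standard_normal((256, 10))      # hidden -> output
s1, s2 = extract_subnet(W1, W2, keep=0.25)
print(s1.shape, s2.shape)                # (784, 64) (64, 10)
```

Shrinking the hidden width cuts the MAC count roughly in proportion to the kept fraction, but, as the abstract argues, a MAC count alone says little about the device's actual energy consumption.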
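The abstract's point about memory access costs can be illustrated with a back-of-the-envelope energy model. In the sketch below, the per-operation energy constants are hypothetical placeholders (real values depend on the technology node and the accelerator's memory hierarchy), chosen only to show how a MAC-only metric and a memory-aware estimate can diverge.

```python
# Hypothetical per-operation energies in pJ (illustrative values only).
E_MAC = 0.2      # one multiply-accumulate
E_SRAM = 1.0     # one on-chip SRAM access
E_DRAM = 100.0   # one off-chip DRAM access

def mac_only_energy(n_mac: int) -> float:
    """Simplistic metric: counts arithmetic operations only."""
    return n_mac * E_MAC

def hardware_aware_energy(n_mac: int, n_sram: int, n_dram: int) -> float:
    """Adds memory-access cost, which often dominates in practice."""
    return n_mac * E_MAC + n_sram * E_SRAM + n_dram * E_DRAM

# Example layer: 1e8 MACs, 3e8 SRAM accesses, 1e6 DRAM accesses.
print(mac_only_energy(10**8))                          # 2.0e7 pJ
print(hardware_aware_energy(10**8, 3 * 10**8, 10**6))  # 4.2e8 pJ
```

With these illustrative constants, memory traffic accounts for roughly 95% of the estimated energy, which is why a MAC-only proxy can badly misrank design points.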
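The effect of "compressed arithmetic formats" can be approximated in software via quantization simulation. A minimal sketch, assuming symmetric uniform fixed-point quantization (a generic technique, not necessarily the format the authors use):

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int) -> np.ndarray:
    """Simulate a bits-wide fixed-point format: map weights onto
    2**(bits-1) - 1 uniform levels per sign, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
for bits in (8, 6, 4):
    mse = np.mean((w - quantize_symmetric(w, bits)) ** 2)
    print(f"{bits}-bit MSE: {mse:.2e}")
```

Fewer bits reduce both multiplier energy and the memory footprint per weight, at the cost of quantization error that the training procedure must absorb.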
Source Journal
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
CiteScore: 5.60
Self-citation rate: 13.80%
Annual publications: 500
Review time: 7 months
Journal Description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.