Energy-Aware Heterogeneous Federated Learning via Approximate DNN Accelerators

Impact Factor: 2.9 · CAS Tier 3 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture)
Kilian Pfeiffer, Konstantinos Balaskas, Kostas Siozios, Jörg Henkel
{"title":"Energy-Aware Heterogeneous Federated Learning via Approximate DNN Accelerators","authors":"Kilian Pfeiffer;Konstantinos Balaskas;Kostas Siozios;Jörg Henkel","doi":"10.1109/TCAD.2024.3509793","DOIUrl":null,"url":null,"abstract":"In Federated Learning (FL), devices that participate in the training usually have heterogeneous resources, i.e., energy availability. In current deployments of FL, devices that do not fulfill certain hardware requirements are often dropped from the collaborative training. However, dropping devices in FL can degrade training accuracy and introduce bias or unfairness. Several works have tackled this problem on an algorithm level, e.g., by letting constrained devices train a subset of the server neural network (NN) model. However, it has been observed that these techniques are not effective w.r.t. accuracy. Importantly, they make simplistic assumptions about devices’ resources via indirect metrics, such as multiply accumulate (MAC) operations or peak memory requirements. We observe that memory access costs (that are currently not considered in simplistic metrics) have a significant impact on the energy consumption. In this work, for the first time, we consider on-device accelerator design for FL with heterogeneous devices. We utilize compressed arithmetic formats and approximate computing, targeting to satisfy limited energy budgets. Using a hardware-aware energy model, we observe that, contrary to the state of the art’s moderate energy reduction, our technique allows for lowering the energy requirements (by <inline-formula> <tex-math>$4\\times $ </tex-math></inline-formula>) while maintaining higher accuracy.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 6","pages":"2054-2066"},"PeriodicalIF":2.9000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10771979/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

In Federated Learning (FL), devices that participate in the training usually have heterogeneous resources, e.g., in energy availability. In current deployments of FL, devices that do not fulfill certain hardware requirements are often dropped from the collaborative training. However, dropping devices in FL can degrade training accuracy and introduce bias or unfairness. Several works have tackled this problem at the algorithm level, e.g., by letting constrained devices train a subset of the server neural network (NN) model. However, these techniques have been observed to be ineffective with respect to accuracy. Importantly, they make simplistic assumptions about devices' resources via indirect metrics, such as multiply-accumulate (MAC) operations or peak memory requirements. We observe that memory access costs (currently not considered in such simplistic metrics) have a significant impact on energy consumption. In this work, for the first time, we consider on-device accelerator design for FL with heterogeneous devices. We utilize compressed arithmetic formats and approximate computing, aiming to satisfy limited energy budgets. Using a hardware-aware energy model, we observe that, in contrast to the moderate energy reduction achieved by the state of the art, our technique lowers the energy requirements by $4\times$ while maintaining higher accuracy.
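To make the algorithm-level baseline concrete: in subset training, a constrained device receives and trains only a width-reduced slice of the server model. Below is a minimal sketch of this idea for a two-layer MLP, assuming hypothetical NumPy weight matrices and HeteroFL-style slicing; it illustrates the prior work the abstract refers to, not this paper's technique.

```python
import numpy as np

def extract_subnet(W1, W2, keep: float):
    """Width-scaled subset of a 2-layer MLP: keep a fraction of the
    hidden units; input and output dimensions stay full so the
    submodel remains compatible with the original task."""
    h = max(1, int(W1.shape[1] * keep))  # number of hidden units kept
    return W1[:, :h], W2[:h, :]          # consistent slices of both layers

rng = np.random.default_rng(1)
W1 = rng.standard_normal((784, 256))     # input -> hidden
W2 = rng.standard_normal((256, 10))      # hidden -> output
s1, s2 = extract_subnet(W1, W2, keep=0.25)
print(s1.shape, s2.shape)                # (784, 64) (64, 10)
```

Shrinking the hidden width cuts the MAC count roughly in proportion to the kept fraction, but, as the abstract argues, a MAC count alone says little about the device's actual energy consumption.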
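The abstract's point about memory access costs can be illustrated with a back-of-the-envelope energy model. In the sketch below, the per-operation energy constants are hypothetical placeholders (real values depend on the technology node and the accelerator's memory hierarchy), chosen only to show how a MAC-only metric and a memory-aware estimate can diverge.

```python
# Hypothetical per-operation energies in pJ (illustrative values only).
E_MAC = 0.2      # one multiply-accumulate
E_SRAM = 1.0     # one on-chip SRAM access
E_DRAM = 100.0   # one off-chip DRAM access

def mac_only_energy(n_mac: int) -> float:
    """Simplistic metric: counts arithmetic operations only."""
    return n_mac * E_MAC

def hardware_aware_energy(n_mac: int, n_sram: int, n_dram: int) -> float:
    """Adds memory-access cost, which often dominates in practice."""
    return n_mac * E_MAC + n_sram * E_SRAM + n_dram * E_DRAM

# Example layer: 1e8 MACs, 3e8 SRAM accesses, 1e6 DRAM accesses.
print(mac_only_energy(10**8))                          # 2.0e7 pJ
print(hardware_aware_energy(10**8, 3 * 10**8, 10**6))  # 4.2e8 pJ
```

With these illustrative constants, memory traffic accounts for roughly 95% of the estimated energy, which is why a MAC-only proxy can badly misrank design points.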
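The effect of "compressed arithmetic formats" can be approximated in software via quantization simulation. A minimal sketch, assuming symmetric uniform fixed-point quantization (a generic technique, not necessarily the format the authors use):

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int) -> np.ndarray:
    """Simulate a bits-wide fixed-point format: map weights onto
    2**(bits-1) - 1 uniform levels per sign, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
for bits in (8, 6, 4):
    mse = np.mean((w - quantize_symmetric(w, bits)) ** 2)
    print(f"{bits}-bit MSE: {mse:.2e}")
```

Fewer bits reduce both multiplier energy and the memory footprint per weight, at the cost of quantization error that the training procedure must absorb.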
Source Journal
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
CiteScore: 5.60
Self-citation rate: 13.80%
Annual publications: 500
Review time: 7 months
Journal Description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.