Straggler-Resilient Federated Learning: Tackling Computation Heterogeneity With Layer-Wise Partial Model Training in Mobile Edge Network

Impact Factor: 7.9 · CAS Tier 2 (Computer Science) · JCR Q1 (Engineering, Multidisciplinary)
Hongda Wu;Ping Wang;C V Aswartha Narayana
DOI: 10.1109/TNSE.2025.3577910
Journal: IEEE Transactions on Network Science and Engineering, vol. 12, no. 6, pp. 4922–4938
Published: 2025-06-06
URL: https://ieeexplore.ieee.org/document/11027799/
Citations: 0

Abstract

Federated Learning (FL) enables many resource-limited devices to train a model collaboratively without data sharing. However, many existing works focus on model-homogeneous FL, where the global and local models are the same size, ignoring the inherently heterogeneous computational capabilities of different devices and restricting resource-constrained devices from contributing to FL. In this paper, we consider model-heterogeneous FL and propose Federated Partial Model Training (FedPMT), where devices with smaller computational capabilities work on partial models (subsets of the global model) and contribute to the global model. Different from Dropout-based partial model generation, which removes neurons in (hidden) model layers at random, model training in FedPMT is achieved from the back-propagation perspective. As such, all devices in FedPMT prioritize the most crucial parts of the global model. Theoretical analysis shows that the proposed partial model training design has a similar convergence rate to the widely adopted Federated Averaging (FedAvg) algorithm, $\mathcal {O}(1/T)$, with the sub-optimality gap enlarged by a constant factor related to the model splitting design in FedPMT. Empirical results show that FedPMT significantly outperforms the existing partial model training designs, FedDrop and HeteroFL, especially on complex tasks. Meanwhile, compared to the popular model-homogeneous benchmark, FedAvg, FedPMT reaches the learning target in a shorter completion time, thus achieving a better trade-off between learning accuracy and completion time.
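The abstract's core idea — layer-wise partial models assigned by device capability, aggregated per layer — can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's algorithm: the splitting rule (a device with capability budget `depth` updates only the last `depth` layers, i.e. those nearest the output and first to receive gradients in back-propagation) and the per-layer averaging are simplified stand-ins, and the "gradient" is random noise in place of a real loss gradient.

```python
import numpy as np

# Hypothetical sketch of FedPMT-style layer-wise partial training.
# The splitting rule and aggregation below are illustrative assumptions,
# not the paper's exact specification.

rng = np.random.default_rng(0)
NUM_LAYERS = 4

def init_model():
    """Global model as a list of per-layer weight matrices."""
    return [rng.normal(size=(3, 3)) for _ in range(NUM_LAYERS)]

def local_update(global_model, depth, lr=0.1):
    """Back-propagation-perspective partial training: a device with
    capability `depth` updates only the last `depth` layers (closest to
    the output), leaving earlier layers frozen at the global values."""
    model = [w.copy() for w in global_model]
    for i in range(NUM_LAYERS - depth, NUM_LAYERS):
        grad = rng.normal(size=model[i].shape)  # stand-in for a real gradient
        model[i] -= lr * grad
    return model, depth

def aggregate(global_model, updates):
    """Per-layer averaging: each layer is averaged only over the devices
    that actually trained it; untouched layers keep the global weights."""
    new_model = []
    for i in range(NUM_LAYERS):
        contribs = [m[i] for m, depth in updates if i >= NUM_LAYERS - depth]
        new_model.append(np.mean(contribs, axis=0) if contribs
                         else global_model[i].copy())
    return new_model

global_model = init_model()
capabilities = [4, 2, 1]  # full model, last two layers, output layer only
updates = [local_update(global_model, d) for d in capabilities]
global_model = aggregate(global_model, updates)
```

Because every device trains the output-side layers, the layers nearest the loss receive the most updates — one plausible reading of the abstract's claim that all devices "prioritize the most crucial parts of the global model."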
Source journal
IEEE Transactions on Network Science and Engineering
Category: Engineering – Control and Systems Engineering
CiteScore: 12.60
Self-citation rate: 9.10%
Annual articles: 393
Journal description: The IEEE Transactions on Network Science and Engineering (TNSE) is committed to the timely publication of peer-reviewed technical articles on the theory and applications of network science and the interconnections among the elements in a system that form a network. In particular, TNSE publishes articles on the understanding, prediction, and control of the structures and behaviors of networks at the fundamental level. The types of networks covered include physical or engineered networks, information networks, biological networks, semantic networks, economic networks, social networks, and ecological networks. The journal aims to discover common principles that govern network structures, functionalities, and behaviors. Another trans-disciplinary focus is the interactions between, and co-evolution of, different genres of networks.