SHARE: Shaping Data Distribution at Edge for Communication-Efficient Hierarchical Federated Learning

Yongheng Deng, Feng Lyu, Ju Ren, Yongmin Zhang, Yuezhi Zhou, Yaoxue Zhang, Yuanyuan Yang
{"title":"SHARE: Shaping Data Distribution at Edge for Communication-Efficient Hierarchical Federated Learning","authors":"Yongheng Deng, Feng Lyu, Ju Ren, Yongmin Zhang, Yuezhi Zhou, Yaoxue Zhang, Yuanyuan Yang","doi":"10.1109/ICDCS51616.2021.00012","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) can enable distributed model training over mobile nodes without sharing privacy-sensitive raw data. However, to achieve efficient FL, one significant challenge is the prohibitive communication overhead to commit model updates since frequent cloud model aggregations are usually required to reach a target accuracy, especially when the data distributions at mobile nodes are imbalanced. With pilot experiments, it is verified that frequent cloud model aggregations can be avoided without performance degradation if model aggregations can be conducted at edge. To this end, we shed light on the hierarchical federated learning (HFL) framework, where a subset of distributed nodes are selected as edge aggregators to conduct edge aggregations. Particularly, under the HFL framework, we formulate a communication cost minimization (CCM) problem to minimize the communication cost raised by edge/cloud aggregations with making decisions on edge aggregator selection and distributed node association. Inspired by the insight that the potential of HFL lies in the data distribution at edge aggregators, we propose SHARE, i.e., SHaping dAta distRibution at Edge, to transform and solve the CCM problem. In SHARE, we divide the original problem into two sub-problems to minimize the per-round communication cost and mean Kullback-Leibler divergence of edge aggregator data, and devise two light-weight algorithms to solve them, respectively. Extensive experiments under various settings are carried out to corroborate the efficacy of SHARE.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDCS51616.2021.00012","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 28

Abstract

Federated learning (FL) enables distributed model training over mobile nodes without sharing privacy-sensitive raw data. However, a significant obstacle to efficient FL is the prohibitive communication overhead of committing model updates, since frequent cloud model aggregations are usually required to reach a target accuracy, especially when the data distributions at mobile nodes are imbalanced. Pilot experiments verify that frequent cloud model aggregations can be avoided without performance degradation if model aggregations can instead be conducted at the edge. To this end, we shed light on the hierarchical federated learning (HFL) framework, where a subset of distributed nodes is selected as edge aggregators to conduct edge aggregations. In particular, under the HFL framework, we formulate a communication cost minimization (CCM) problem that minimizes the communication cost incurred by edge/cloud aggregations by jointly deciding on edge aggregator selection and distributed node association. Inspired by the insight that the potential of HFL lies in the data distribution at edge aggregators, we propose SHARE, i.e., SHaping dAta distRibution at Edge, to transform and solve the CCM problem. In SHARE, we divide the original problem into two sub-problems, minimizing the per-round communication cost and the mean Kullback-Leibler divergence of edge aggregator data, respectively, and devise two lightweight algorithms to solve them. Extensive experiments under various settings corroborate the efficacy of SHARE.
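The second sub-problem measures how balanced each edge aggregator's pooled data is via the mean Kullback-Leibler divergence between each aggregator's label distribution and the global one. Below is a minimal sketch of that objective for a given node-to-aggregator association; it is not the paper's implementation, and the function names (`label_distribution`, `kl_divergence`, `mean_edge_kl`), the per-class label-count data layout, and the NumPy-based structure are all illustrative assumptions.

```python
import numpy as np

def label_distribution(label_counts):
    """Normalize a vector of per-class sample counts into a distribution."""
    counts = np.asarray(label_counts, dtype=float)
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), clipping to avoid log(0) on empty classes."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def mean_edge_kl(node_label_counts, associations, global_dist):
    """Mean KL divergence between each edge aggregator's pooled label
    distribution and the global label distribution.

    node_label_counts: dict node_id -> per-class sample counts
    associations:      dict aggregator_id -> list of associated node_ids
    global_dist:       global label distribution over all nodes
    """
    divergences = []
    for agg, nodes in associations.items():
        # Pool the label counts of all nodes associated with this aggregator.
        pooled = sum(np.asarray(node_label_counts[n], dtype=float) for n in nodes)
        divergences.append(kl_divergence(label_distribution(pooled), global_dist))
    return float(np.mean(divergences))

if __name__ == "__main__":
    # Toy example (hypothetical data): 4 nodes, 3 classes, 2 edge aggregators.
    counts = {0: [90, 5, 5], 1: [5, 90, 5], 2: [5, 5, 90], 3: [30, 30, 40]}
    global_dist = label_distribution(sum(np.asarray(c) for c in counts.values()))
    # Grouping complementary skews yields near-balanced aggregators (low KL).
    print(mean_edge_kl(counts, {"e1": [0, 1, 2], "e2": [3]}, global_dist))
    # Grouping similar skews leaves aggregators imbalanced (high KL).
    print(mean_edge_kl(counts, {"e1": [0], "e2": [1, 2, 3]}, global_dist))
```

Under this toy metric, an association that pools nodes with complementary class skews scores a lower mean KL, which is the intuition behind shaping the data distribution at edge aggregators.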