CFL-HC: A Coded Federated Learning Framework for Heterogeneous Computing Scenarios
Dong Wang, Baoqian Wang, Jinran Zhang, K. Lu, Junfei Xie, Yan Wan, Shengli Fu
2021 IEEE Global Communications Conference (GLOBECOM), December 2021
DOI: 10.1109/GLOBECOM46510.2021.9685962
Citations: 1
Abstract
Federated learning (FL) is a promising machine learning paradigm because it allows distributed edge devices to collaboratively train a model without sharing their raw data. In practice, a major challenge for FL is that edge devices are heterogeneous, so slow devices may compromise the convergence of model training. To address this challenge, several recent studies have proposed different solutions, among which a promising approach is to utilize coded computing to facilitate the training of linear models. Nevertheless, the existing coded FL (CFL) scheme is limited by a fixed coding redundancy parameter, and a weight matrix used in the existing design may introduce unnecessary errors. In this paper, we tackle these issues and propose a novel framework, namely CFL-HC, to facilitate CFL in heterogeneous computing scenarios. In our framework, we consider a computing system consisting of a central server and multiple computing devices that hold original or coded datasets, and we specify an expected number of input-output pairs to be used in each round. Within this framework, we formulate an optimization problem to find the optimal deadline for each training round and the optimal size of the computing task allocated to each computing device, and we design a two-step optimization scheme to obtain the optimal solution. To evaluate the proposed framework, we develop a real CFL system using the message passing interface platform. Based on this system, we conduct numerical experiments, which demonstrate the advantages of the proposed framework in terms of both accuracy and convergence speed.
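The abstract's core idea — using coded computing so that slow devices cannot stall linear-model training — can be illustrated with a minimal sketch. This is not the paper's CFL-HC algorithm; it is a generic coded-gradient example under assumed parameters (4 data partitions, 6 workers, a random Gaussian encoding matrix `B`), showing how a server can recover the exact full gradient of a linear-regression loss from only the fastest `k` worker responses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression problem (sizes are illustrative assumptions).
k, n_workers, d = 4, 6, 3        # n_workers - k = 2 stragglers tolerated
X = rng.normal(size=(40, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=40)
w = np.zeros(d)                  # current model at some training round

# Split the data into k partitions; each partition's gradient of the
# squared loss is what one (uncoded) device would compute.
parts = np.array_split(np.arange(40), k)
g = np.stack([X[p].T @ (X[p] @ w - y[p]) for p in parts])   # shape (k, d)

# Random linear code: worker i returns the combination sum_j B[i, j] * g_j.
B = rng.normal(size=(n_workers, k))
coded = B @ g

# Suppose two workers straggle; decode from the fastest k responses.
fast = [0, 2, 3, 5]
g_hat = np.linalg.solve(B[fast], coded[fast])   # recover partition gradients
full_grad = g_hat.sum(axis=0)

# The decoded gradient matches the exact full-batch gradient.
exact = X.T @ (X @ w - y)
assert np.allclose(full_grad, exact)
```

Because `B` is a random Gaussian matrix, any `k` of its rows are invertible with probability one, so the server can proceed as soon as any `k` workers respond; the paper's framework additionally optimizes the per-round deadline and per-device task sizes, which this sketch does not attempt.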