Local Epochs Inefficiency Caused by Device Heterogeneity in Federated Learning

Y. Zeng, Xin Wang, Junfeng Yuan, Jilin Zhang, Jian Wan
{"title":"Local Epochs Inefficiency Caused by Device Heterogeneity in Federated Learning","authors":"Y. Zeng, Xin Wang, Junfeng Yuan, Jilin Zhang, Jian Wan","doi":"10.1155/2022/6887040","DOIUrl":null,"url":null,"abstract":"Federated learning is a new framework of machine learning, it trains models locally on multiple clients and then uploads local models to the server for model aggregation iteratively until the model converges. In most cases, the local epochs of all clients are set to the same value in federated learning. In practice, the clients are usually heterogeneous, which leads to the inconsistent training speed of clients. The faster clients will remain idle for a long time to wait for the slower clients, which prolongs the model training time. As the time cost of clients’ local training can reflect the clients’ training speed, and it can be used to guide the dynamic setting of local epochs, we propose a method based on deep learning to predict the training time of models on heterogeneous clients. First, a neural network is designed to extract the influence of different model features on training time. Second, we propose a dimensionality reduction rule to extract the key features which have a great impact on training time based on the influence of model features. Finally, we use the key features extracted by the dimensionality reduction rule to train the time prediction model. Our experiments show that, compared with the current prediction method, our method reduces 30% of model features and 25% of training data for the convolutional layer, 20% of model features and 20% of training data for the dense layer, while maintaining the same level of prediction error.","PeriodicalId":23995,"journal":{"name":"Wirel. Commun. Mob. Comput.","volume":"9 1","pages":"6887040:1-6887040:15"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Wirel. Commun. Mob. Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2022/6887040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Federated learning is a new machine learning framework in which models are trained locally on multiple clients and the local models are then uploaded to a server for aggregation, iterating until the global model converges. In most cases, the number of local epochs is set to the same value for all clients. In practice, however, the clients are usually heterogeneous, so their training speeds differ: faster clients sit idle for long periods waiting for slower ones, which prolongs overall model training. Since the time cost of a client's local training reflects its training speed and can guide the dynamic setting of local epochs, we propose a deep-learning-based method for predicting model training time on heterogeneous clients. First, a neural network is designed to extract the influence of different model features on training time. Second, based on that influence, we propose a dimensionality reduction rule that extracts the key features with the greatest impact on training time. Finally, we train the time prediction model using the key features extracted by the rule. Our experiments show that, compared with the current prediction method, our method reduces the model features by 30% and the training data by 25% for the convolutional layer, and by 20% and 20% for the dense layer, while maintaining the same level of prediction error.
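To make the idea concrete, below is a minimal sketch (not the paper's implementation) of the two pieces the abstract describes: a small neural network that maps model features to predicted per-epoch training time, and a helper that uses those predictions to set each client's local epochs so fast clients do more work instead of idling. The feature set, network sizes, and helper names are illustrative assumptions.

```python
# Sketch only: the paper's actual feature set, architecture, and
# dimensionality reduction rule are not reproduced here.
import torch
import torch.nn as nn

class TimePredictor(nn.Module):
    """Predicts per-epoch training time from model/layer features
    (e.g., batch size, channels, kernel size; the exact features
    are an assumption for this sketch)."""
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def assign_local_epochs(epoch_times, round_budget):
    """Hypothetical helper: give each client as many local epochs as
    fit in the round's time budget (at least one), so that all clients
    finish a round at roughly the same time."""
    return [max(1, int(round_budget // t)) for t in epoch_times]

# Example with three heterogeneous clients and six assumed features.
model = TimePredictor(n_features=6)
features = torch.rand(3, 6)
epoch_times = model(features).squeeze(1).abs().tolist()
print(assign_local_epochs(epoch_times, round_budget=sum(epoch_times)))
```

In a real pipeline, the predictor would first be trained on measured (features, time) pairs from the clients' hardware, and the dimensionality reduction rule would prune low-influence features before that training step.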