FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data

IF 6.2 · CAS Zone 2 (Computer Science) · JCR Q1, Computer Science, Theory & Methods
Wu-Chun Chung, Chien-Hu Peng
DOI: 10.1016/j.future.2024.107674
Journal: Future Generation Computer Systems – The International Journal of eScience, Volume 166, Article 107674
Published: 2024-12-09 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24006381
Citations: 0

Abstract

Federated Learning (FL) is an emerging machine-learning training framework that protects data privacy by never accessing clients' original data. However, participating clients in FL have different computing resources: clients with insufficient resources may be unable to join the training due to hardware limitations, and their restricted computing speed can prolong the overall training time. In addition, the Non-IID problem arises when clients' data distributions differ, degrading training performance. To overcome these problems, this paper proposes FedCOLA, an approach that adapts to varied data distributions among heterogeneous clients. By introducing a feature-concatenation and local-update mechanism, FedCOLA lets different clients train models with different numbers of layers, reducing both the communication load and the time delay of collaborative training. Combined with an adaptive auxiliary model and a personalized model, FedCOLA further improves test accuracy under various Non-IID data distributions. To evaluate performance, this paper analyzes the effects of different Non-IID data distributions on distinct methods. The empirical results show that, under an extremely imbalanced data distribution, FedCOLA improves accuracy by 5%, needs 57% fewer rounds to reach the same accuracy, and reduces the communication load by 77%. Compared with state-of-the-art methods in a real deployment of heterogeneous clients, FedCOLA cuts the time needed to reach the same accuracy by 70% and the time to complete 200 training rounds by 30%. In conclusion, FedCOLA not only accommodates various Non-IID data distributions but also lets heterogeneous clients train models with different layers while significantly reducing the time delay and communication load.
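The "extremely imbalanced" Non-IID settings mentioned in the abstract are commonly simulated in FL experiments with a Dirichlet label-skew partition. The sketch below is an illustrative assumption, not the paper's exact partitioning scheme: each class's samples are divided among clients according to Dirichlet proportions, where a smaller `alpha` produces a more imbalanced split.

```python
# Hedged sketch: a Dirichlet label-skew partition, a common way to simulate
# Non-IID client data in FL experiments. The function name, alpha value, and
# toy dataset are illustrative assumptions, not taken from the FedCOLA paper.
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices among clients; smaller alpha => more imbalance."""
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    client_idx = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Proportion of class c assigned to each client.
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return [np.array(ci) for ci in client_idx]

# Toy dataset: 10 classes with 100 samples each.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
```

With `alpha=0.1` most clients end up dominated by a few classes, approximating the extreme imbalance the paper evaluates; larger values (e.g. `alpha=100`) approach an IID split.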
Source journal metrics: CiteScore 19.90 · Self-citation rate 2.70% · Articles published 376 · Review time 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.