FedCSS: Joint Client-and-Sample Selection for Hard Sample-Aware Noise-Robust Federated Learning

Anran Li, Yue Cao, Jiabao Guo, Hongyi Peng, Qing Guo, Han Yu
Proceedings of the ACM on Management of Data. Published 2023-11-13. DOI: 10.1145/3617332 (https://doi.org/10.1145/3617332)

Abstract

Federated Learning (FL) enables a large number of data owners (a.k.a. FL clients) to jointly train a machine learning model without disclosing private local data. The importance of local data samples to the FL model varies widely. This is exacerbated by the presence of noisy data, which exhibit large losses similar to those of important (hard) samples. Currently, no FL approach can effectively distinguish hard samples (which are beneficial) from noisy samples (which are harmful). To bridge this gap, we propose the Federated Client and Sample Selection (FedCSS) approach. It is a bilevel optimization approach for FL client-and-sample selection that achieves hard sample-aware, noise-robust learning in a privacy-preserving manner. It performs meta-learning-based online approximation to iteratively update global FL models, select the most positively influential samples, and deal with training data noise. Theoretical analysis shows that it is guaranteed to converge efficiently. Experimental comparison against six state-of-the-art baselines on five real-world datasets in the presence of data noise and heterogeneity shows that it achieves up to 26.4% higher test accuracy, while reducing communication and computation costs by at least 41.5% and 1.2%, respectively.
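The abstract's central difficulty, that hard and noisy samples both show large losses, can be illustrated with a small sketch. This is not the FedCSS algorithm, whose details the abstract does not give. It is a generic one-step meta-learning reweighting on a single machine, and it assumes a small trusted validation set, a logistic-regression model, and hypothetical helper names (`sample_weights`, `mean_grad`) chosen only for illustration. Each training sample is scored by how well its gradient aligns with the validation gradient, and negative scores are clipped to zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    # Probability of class 1 under a linear logistic model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def loss(w, x, y):
    # Binary cross-entropy for one sample.
    p = predict(w, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def grad(w, x, y):
    # Gradient of the cross-entropy loss with respect to w.
    p = predict(w, x)
    return [(p - y) * xi for xi in x]

def mean_grad(w, data):
    grads = [grad(w, x, y) for x, y in data]
    return [sum(col) / len(grads) for col in zip(*grads)]

def sample_weights(w, train, val):
    # Score each training sample by the dot product between its gradient
    # and the mean gradient on the trusted validation set (assumption:
    # such a set exists). Clip negatives to zero, then normalize.
    g_val = mean_grad(w, val)
    scores = [
        max(0.0, sum(a * b for a, b in zip(grad(w, x, y), g_val)))
        for x, y in train
    ]
    total = sum(scores)
    return [s / total if total > 0 else 0.0 for s in scores]

# Toy setup: the true label is 1 when x0 + x1 > 0, but the current
# model has only learned to use x0.
w = [1.0, 0.0]
train = [
    ([-1.0, 3.0], 1),  # hard: correctly labeled, currently misclassified
    ([1.0, 1.0], 0),   # noisy: true label is 1, recorded as 0
    ([2.0, 0.5], 1),   # easy: correctly labeled, already well fit
]
val = [
    ([1.0, 2.0], 1),
    ([-1.0, -2.0], 0),
    ([-0.5, 2.0], 1),
    ([0.5, -2.0], 0),
]

weights = sample_weights(w, train, val)
for (x, y), wt in zip(train, weights):
    print(f"x={x} y={y} loss={loss(w, x, y):.3f} weight={wt:.3f}")
```

In this toy setup the hard and the noisy sample have the same loss, so a loss threshold alone cannot separate them. The gradient-alignment score does: the hard sample's gradient points the same way as the validation gradient and receives most of the weight, while the mislabeled sample's gradient opposes it and is clipped to zero.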