{"title":"How Valuable is Your Data? Optimizing Client Recruitment in Federated Learning","authors":"Yichen Ruan;Xiaoxi Zhang;Carlee Joe-Wong","doi":"10.1109/TNET.2024.3422264","DOIUrl":null,"url":null,"abstract":"Federated learning allows distributed clients to train a shared machine learning model while preserving user privacy. In this framework, user devices (i.e., clients) perform local iterations of the learning algorithm on their data. These updates are periodically aggregated to form a shared model. Thus, a client represents the bundle of the user data, the device, and the user’s willingness to participate: since participating in federated learning requires clients to expend resources and reveal some information about their data, users may require some form of compensation to contribute to the training process. Recruiting more users generally results in higher accuracy, but slower completion time and higher cost. We propose the first work to theoretically analyze the resulting performance tradeoffs in deciding which clients to recruit for the federated learning algorithm. Our framework accounts for both accuracy (training and testing) and efficiency (completion time and cost) metrics. We provide solutions to this NP-Hard optimization problem and verify the value of client recruitment in experiments on synthetic and real-world data. The results of this work can serve as a guideline for the real-world deployment of federated learning and an initial investigation of the client recruitment problem.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4207-4221"},"PeriodicalIF":3.0000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Networking","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10597379/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
Federated learning allows distributed clients to train a shared machine learning model while preserving user privacy. In this framework, user devices (i.e., clients) perform local iterations of the learning algorithm on their data. These updates are periodically aggregated to form a shared model. Thus, a client represents the bundle of the user data, the device, and the user’s willingness to participate: since participating in federated learning requires clients to expend resources and reveal some information about their data, users may require some form of compensation to contribute to the training process. Recruiting more users generally results in higher accuracy, but slower completion time and higher cost. We propose the first work to theoretically analyze the resulting performance tradeoffs in deciding which clients to recruit for the federated learning algorithm. Our framework accounts for both accuracy (training and testing) and efficiency (completion time and cost) metrics. We provide solutions to this NP-Hard optimization problem and verify the value of client recruitment in experiments on synthetic and real-world data. The results of this work can serve as a guideline for the real-world deployment of federated learning and an initial investigation of the client recruitment problem.
期刊介绍:
The IEEE/ACM Transactions on Networking’s high-level objective is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media: e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.