{"title":"Using Synthetic Data to Mitigate Unfairness and Preserve Privacy through Single-Shot Federated Learning","authors":"Chia-Yuan Wu, Frank E. Curtis, Daniel P. Robinson","doi":"arxiv-2409.09532","DOIUrl":null,"url":null,"abstract":"To address unfairness issues in federated learning (FL), contemporary\napproaches typically use frequent model parameter updates and transmissions\nbetween the clients and server. In such a process, client-specific information\n(e.g., local dataset size or data-related fairness metrics) must be sent to the\nserver to compute, e.g., aggregation weights. All of this results in high\ntransmission costs and the potential leakage of client information. As an\nalternative, we propose a strategy that promotes fair predictions across\nclients without the need to pass information between the clients and server\niteratively and prevents client data leakage. For each client, we first use\ntheir local dataset to obtain a synthetic dataset by solving a bilevel\noptimization problem that addresses unfairness concerns during the learning\nprocess. We then pass each client's synthetic dataset to the server, the\ncollection of which is used to train the server model using conventional\nmachine learning techniques (that do not take fairness metrics into account).\nThus, we eliminate the need to handle fairness-specific aggregation weights\nwhile preserving client privacy. Our approach requires only a single\ncommunication between the clients and the server, thus making it\ncomputationally cost-effective, able to maintain privacy, and able to ensuring\nfairness. We present empirical evidence to demonstrate the advantages of our\napproach. The results illustrate that our method effectively uses synthetic\ndata as a means to mitigate unfairness and preserve client privacy.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"34 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computers and Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09532","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
To address unfairness issues in federated learning (FL), contemporary
approaches typically use frequent model parameter updates and transmissions
between the clients and server. In such a process, client-specific information
(e.g., local dataset size or data-related fairness metrics) must be sent to the
server to compute quantities such as aggregation weights. This results in high
transmission costs and the potential leakage of client information. As an
alternative, we propose a strategy that promotes fair predictions across
clients without iteratively passing information between the clients and
server, and that prevents client data leakage. For each client, we first use
their local dataset to obtain a synthetic dataset by solving a bilevel
optimization problem that addresses unfairness concerns during the learning
process. We then pass each client's synthetic dataset to the server, the
collection of which is used to train the server model using conventional
machine learning techniques (that do not take fairness metrics into account).
Thus, we eliminate the need to handle fairness-specific aggregation weights
while preserving client privacy. Our approach requires only a single
communication between the clients and the server, thus making it
computationally cost-effective, able to maintain privacy, and able to ensure
fairness. We present empirical evidence to demonstrate the advantages of our
approach. The results illustrate that our method effectively uses synthetic
data as a means to mitigate unfairness and preserve client privacy.
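
Below is a minimal sketch of the single-shot workflow the abstract describes: each client locally produces a small synthetic dataset, sends it to the server once, and the server trains a conventional model on the pooled synthetic data. The abstract does not specify the paper's bilevel optimization problem or its fairness metric, so the client-side step here is a crude placeholder (group-balanced resampling with added noise); all function names, parameters, and the toy data are illustrative assumptions, not the authors' method.

```python
# Sketch of single-shot federated learning with client-generated synthetic data.
# The bilevel fairness-aware synthesis from the paper is replaced by a simple
# placeholder; only the overall communication pattern matches the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression

def make_synthetic_dataset(X, y, s, n_synth=100, seed=0):
    """Client-side placeholder: return a small synthetic dataset.

    Instead of solving the paper's bilevel problem, this balances the two
    sensitive groups (s == 0 / s == 1) by resampling and perturbs the
    features with noise so raw client records are not shipped verbatim.
    """
    rng = np.random.default_rng(seed)
    idx0 = rng.choice(np.flatnonzero(s == 0), n_synth // 2, replace=True)
    idx1 = rng.choice(np.flatnonzero(s == 1), n_synth // 2, replace=True)
    idx = np.concatenate([idx0, idx1])
    X_synth = X[idx] + rng.normal(scale=0.1, size=(len(idx), X.shape[1]))
    return X_synth, y[idx]

# --- Clients: one-shot generation of synthetic data (toy local datasets) ---
rng = np.random.default_rng(42)
client_payloads = []
for c in range(3):
    Xc = rng.normal(size=(500, 5))
    sc = (rng.random(500) < 0.5).astype(int)            # sensitive attribute
    yc = (Xc[:, 0] + 0.5 * sc + rng.normal(size=500) > 0).astype(int)
    client_payloads.append(make_synthetic_dataset(Xc, yc, sc, seed=c))

# --- Server: pool the synthetic datasets (the only communication step) and
# --- train a conventional model with no fairness-specific aggregation weights.
X_pool = np.vstack([X for X, _ in client_payloads])
y_pool = np.concatenate([y for _, y in client_payloads])
server_model = LogisticRegression().fit(X_pool, y_pool)
print("server model trained on", len(y_pool), "synthetic points")
```

Note the design point the abstract emphasizes: because clients transmit synthetic data exactly once, the server needs no per-client statistics (dataset sizes, fairness metrics) and no iterative parameter exchange; any fairness handling happens entirely in the client-side synthesis step.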