Privacy-preserving federated learning based on partial low-quality data

Huiyong Wang, Qi Wang, Yong Ding, Shijie Tang, Yujue Wang

Journal of Cloud Computing, published 2024-03-18. DOI: 10.1186/s13677-024-00618-8
Abstract
Traditional machine learning requires collecting data from participants for training, which may expose participants' private data to malicious acquisition. Federated learning protects participants' data privacy by moving the training process from a centralized server to terminal devices. However, the server may still recover participants' private information through inference attacks and other methods. In addition, the data provided by participants varies in quality, and excessive involvement of low-quality data in training can render the model unusable, which is an important issue in current mainstream federated learning. To address these issues, this paper proposes a Privacy Preserving Federated Learning Scheme with Partial Low-Quality Data (PPFL-LQDP). It achieves good training results while allowing participants to contribute partially low-quality data, thereby enhancing the privacy and security of the federated learning scheme. Specifically, we use a distributed Paillier cryptographic mechanism to protect the privacy and security of participants' data during the federated training process. Additionally, we construct composite evaluation values for the data held by participants to reduce the involvement of low-quality data, thereby minimizing its negative impact on the model. Experiments on the MNIST dataset demonstrate that the scheme can complete federated model training even when some low-quality data participates, while effectively protecting the security and privacy of participants' data. Comparisons with related schemes also show that our scheme has good overall performance.
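The abstract's privacy claim rests on the additive homomorphism of the Paillier cryptosystem: a server can multiply ciphertexts of locally computed gradients and obtain an encryption of their sum without ever seeing an individual plaintext. The sketch below is a minimal single-key illustration of that property only; it is not the paper's distributed (threshold) Paillier protocol, and the toy primes, helper names, and integer-encoded "gradient" values are hypothetical choices made for readability, not parameters from the paper.

```python
# Minimal sketch of Paillier additive homomorphism (single-key, toy parameters).
# Illustrative only; the paper uses a distributed Paillier mechanism instead.
import random
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def generate_keypair(p, q):
    """Toy key generation from two small primes (real deployments use ~2048-bit primes)."""
    n = p * q
    g = n + 1                       # common simplification: g = n + 1
    lam = lcm(p - 1, q - 1)         # lambda(n)
    mu = pow(lam, -1, n)            # valid because g = n + 1
    return (n, g), (lam, mu, n)

def encrypt(pub, m):
    """Encrypt an integer-encoded message m under the public key (n, g)."""
    n, g = pub
    n_sq = n * n
    while True:
        r = random.randrange(2, n)
        if gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(priv, c):
    """Decrypt ciphertext c; L(x) = (x - 1) // n."""
    lam, mu, n = priv
    n_sq = n * n
    return (((pow(c, lam, n_sq) - 1) // n) * mu) % n

if __name__ == "__main__":
    pub, priv = generate_keypair(2357, 2551)   # toy primes, NOT secure
    # Two participants encrypt their integer-encoded gradient contributions locally.
    c1 = encrypt(pub, 42)
    c2 = encrypt(pub, 17)
    # The aggregator multiplies ciphertexts, which adds the underlying plaintexts.
    c_sum = (c1 * c2) % (pub[0] ** 2)
    assert decrypt(priv, c_sum) == 42 + 17
    print("aggregated plaintext:", decrypt(priv, c_sum))
```

In the distributed setting described in the abstract, no single party would hold the full decryption key as this sketch does; decryption of the aggregate would require the cooperation of multiple key-share holders, and the composite evaluation values would additionally weight or filter each participant's contribution before aggregation.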