Privacy-preserving federated learning based on partial low-quality data

Huiyong Wang, Qi Wang, Yong Ding, Shijie Tang, Yujue Wang

Journal of Cloud Computing, published 2024-03-18. DOI: 10.1186/s13677-024-00618-8
Abstract
Traditional machine learning requires collecting data from participants for training, which may expose participants' private data to malicious acquisition. Federated learning protects participants' data privacy by moving the training process from a centralized server to terminal devices. However, the server may still recover participants' private information through inference attacks and other methods. In addition, the data provided by participants varies in quality, and excessive involvement of low-quality data in training can render the model unusable, which is an important issue in current mainstream federated learning. To address these issues, this paper proposes a Privacy Preserving Federated Learning Scheme with Partial Low-Quality Data (PPFL-LQDP). It achieves good training results while allowing participants to contribute partially low-quality data, thereby enhancing the privacy and security of the federated learning scheme. Specifically, we use a distributed Paillier cryptographic mechanism to protect the privacy and security of participants' data during the federated training process. Additionally, we construct composite evaluation values for the data held by participants to reduce the involvement of low-quality data, thereby minimizing its negative impact on the model. Experiments on the MNIST dataset demonstrate that the scheme can complete federated model training even when some low-quality data participates, while effectively protecting the security and privacy of participants' data. Comparisons with related schemes also show that our scheme has good overall performance.
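The abstract's privacy claim rests on the additive homomorphism of the Paillier cryptosystem: a server can multiply ciphertexts of locally computed gradients and obtain an encryption of their sum without ever seeing an individual plaintext. The sketch below is a minimal single-key illustration of that property only; it is not the paper's distributed (threshold) Paillier protocol, and the toy primes, helper names, and integer-encoded "gradient" values are hypothetical choices made for readability, not parameters from the paper.

```python
# Minimal sketch of Paillier additive homomorphism (single-key, toy parameters).
# Illustrative only; the paper uses a distributed Paillier mechanism instead.
import random
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def generate_keypair(p, q):
    """Toy key generation from two small primes (real deployments use ~2048-bit primes)."""
    n = p * q
    g = n + 1                       # common simplification: g = n + 1
    lam = lcm(p - 1, q - 1)         # lambda(n)
    mu = pow(lam, -1, n)            # valid because g = n + 1
    return (n, g), (lam, mu, n)

def encrypt(pub, m):
    """Encrypt an integer-encoded message m under the public key (n, g)."""
    n, g = pub
    n_sq = n * n
    while True:
        r = random.randrange(2, n)
        if gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(priv, c):
    """Decrypt ciphertext c; L(x) = (x - 1) // n."""
    lam, mu, n = priv
    n_sq = n * n
    return (((pow(c, lam, n_sq) - 1) // n) * mu) % n

if __name__ == "__main__":
    pub, priv = generate_keypair(2357, 2551)   # toy primes, NOT secure
    # Two participants encrypt their integer-encoded gradient contributions locally.
    c1 = encrypt(pub, 42)
    c2 = encrypt(pub, 17)
    # The aggregator multiplies ciphertexts, which adds the underlying plaintexts.
    c_sum = (c1 * c2) % (pub[0] ** 2)
    assert decrypt(priv, c_sum) == 42 + 17
    print("aggregated plaintext:", decrypt(priv, c_sum))
```

In the distributed setting described in the abstract, no single party would hold the full decryption key as this sketch does; decryption of the aggregate would require the cooperation of multiple key-share holders, and the composite evaluation values would additionally weight or filter each participant's contribution before aggregation.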