{"title":"联邦XGBoost的自适应梯度隐私保护算法","authors":"Hongyi Cai, Jianping Cai, Lan Sun","doi":"10.1145/3590003.3590051","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) is a novel machine learning framework in which machine learning models are built jointly by multiple parties. We investigate the privacy preservation of XGBoost, a gradient boosting decision tree (GBDT) model, in the context of FL. While recent work relies on cryptographic schemes to preserve the privacy of model gradients, these methods are computationally expensive. In this paper, we propose an adaptive gradient privacy-preserving algorithm based on differential privacy (DP), which is more computationally efficient. Our algorithm perturbs individual data by computing an adaptive gradient mean per sample and adding appropriate noise during XGBoost training, while still making the perturbed gradient data available. The training accuracy and communication efficiency of the model are guaranteed under the premise of satisfying the definition of DP. We show the proposed algorithm outperforms other DP methods in terms of prediction accuracy and approaches the lossless federated XGBoost model while being more efficient.","PeriodicalId":340225,"journal":{"name":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An Adaptive Gradient Privacy-Preserving Algorithm for Federated XGBoost\",\"authors\":\"Hongyi Cai, Jianping Cai, Lan Sun\",\"doi\":\"10.1145/3590003.3590051\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning (FL) is a novel machine learning framework in which machine learning models are built jointly by multiple parties. We investigate the privacy preservation of XGBoost, a gradient boosting decision tree (GBDT) model, in the context of FL. While recent work relies on cryptographic schemes to preserve the privacy of model gradients, these methods are computationally expensive. In this paper, we propose an adaptive gradient privacy-preserving algorithm based on differential privacy (DP), which is more computationally efficient. Our algorithm perturbs individual data by computing an adaptive gradient mean per sample and adding appropriate noise during XGBoost training, while still making the perturbed gradient data available. The training accuracy and communication efficiency of the model are guaranteed under the premise of satisfying the definition of DP. 
We show the proposed algorithm outperforms other DP methods in terms of prediction accuracy and approaches the lossless federated XGBoost model while being more efficient.\",\"PeriodicalId\":340225,\"journal\":{\"name\":\"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3590003.3590051\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3590003.3590051","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Adaptive Gradient Privacy-Preserving Algorithm for Federated XGBoost
Hongyi Cai, Jianping Cai, Lan Sun
Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning, 2023. DOI: 10.1145/3590003.3590051
Federated learning (FL) is a machine learning framework in which models are built jointly by multiple parties. We investigate the privacy preservation of XGBoost, a gradient boosting decision tree (GBDT) model, in the context of FL. While recent work relies on cryptographic schemes to preserve the privacy of model gradients, these methods are computationally expensive. In this paper, we propose an adaptive gradient privacy-preserving algorithm based on differential privacy (DP), which is more computationally efficient. Our algorithm perturbs individual data by computing an adaptive gradient mean per sample and adding appropriate noise during XGBoost training, while keeping the perturbed gradients usable for model updates. Training accuracy and communication efficiency are preserved while the algorithm satisfies the definition of DP. We show that the proposed algorithm outperforms other DP methods in prediction accuracy and approaches the accuracy of the lossless federated XGBoost model while being more computationally efficient.
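
To make the mechanism concrete, below is a minimal Python sketch of the kind of per-sample gradient perturbation the abstract describes. It is not the authors' exact algorithm: the adaptive bound (taken here as the mean absolute gradient of the current boosting round), the use of the Laplace mechanism, the privacy budget epsilon, and all function names are assumptions for illustration only.

    import numpy as np
    import xgboost as xgb

    def perturb_gradients(grads, epsilon, rng=None):
        # Hypothetical sketch: clip each per-sample gradient to an
        # adaptive bound (this round's mean absolute gradient), then
        # add Laplace noise calibrated to that bound so the released
        # gradients satisfy epsilon-DP for the round.
        rng = rng or np.random.default_rng()
        bound = np.mean(np.abs(grads))           # adaptive gradient mean
        clipped = np.clip(grads, -bound, bound)
        # Changing one sample moves its clipped gradient by at most
        # 2 * bound, so the Laplace scale is 2 * bound / epsilon.
        return clipped + rng.laplace(0.0, 2.0 * bound / epsilon, grads.shape)

    def dp_logistic_obj(preds, dtrain, epsilon=1.0):
        # Custom XGBoost objective that releases perturbed gradients
        # for binary classification with a logistic loss.
        labels = dtrain.get_label()
        probs = 1.0 / (1.0 + np.exp(-preds))
        grad = perturb_gradients(probs - labels, epsilon)
        hess = probs * (1.0 - probs)   # hessians left unperturbed in this sketch
        return grad, hess

    # Usage: plug the objective into standard XGBoost training, e.g.
    # booster = xgb.train(params, dtrain, num_boost_round=50, obj=dp_logistic_obj)

In a federated deployment, each party would perturb its local gradients this way before sharing them with the coordinator. Note that the data-dependent clipping bound would itself need to be estimated privately to give an end-to-end DP guarantee; the sketch omits that step for clarity.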