{"title":"垂直联邦学习预测过程中隐私泄露的综合分析","authors":"Xue Jiang, Xuebing Zhou, Jens Grossklags","doi":"10.2478/popets-2022-0045","DOIUrl":null,"url":null,"abstract":"Abstract Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party having the true labels jointly trains a model with other parties (referred to as passive parties) in order to use more features to achieve higher model accuracy. During the prediction phase, all the parties collaboratively compute the predicted confidence scores of each target record and the results will be finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features and cause severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] regarding two aspects. We first design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Moreover, besides performing the attack under the white-box setting, we give the first attempt to conduct the attack under the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can achieve at best twice or thrice of a reduction of attack error compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility performances. Experimental results demonstrate that privacy leakage from the confidence scores is a substantial privacy risk in VFL frameworks during the prediction phase, which cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":"2022 1","pages":"263 - 281"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Comprehensive Analysis of Privacy Leakage in Vertical Federated Learning During Prediction\",\"authors\":\"Xue Jiang, Xuebing Zhou, Jens Grossklags\",\"doi\":\"10.2478/popets-2022-0045\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party having the true labels jointly trains a model with other parties (referred to as passive parties) in order to use more features to achieve higher model accuracy. During the prediction phase, all the parties collaboratively compute the predicted confidence scores of each target record and the results will be finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features and cause severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] regarding two aspects. 
We first design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Moreover, besides performing the attack under the white-box setting, we give the first attempt to conduct the attack under the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can achieve at best twice or thrice of a reduction of attack error compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility performances. Experimental results demonstrate that privacy leakage from the confidence scores is a substantial privacy risk in VFL frameworks during the prediction phase, which cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.\",\"PeriodicalId\":74556,\"journal\":{\"name\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"volume\":\"2022 1\",\"pages\":\"263 - 281\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-03-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2478/popets-2022-0045\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/popets-2022-0045","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Comprehensive Analysis of Privacy Leakage in Vertical Federated Learning During Prediction
Abstract

Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party holding the true labels jointly trains a model with other parties (referred to as passive parties) in order to leverage more features and achieve higher model accuracy. During the prediction phase, all parties collaboratively compute the predicted confidence scores for each target record, and the results are finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features, causing severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] in two aspects. First, we design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Second, beyond performing the attack in the white-box setting, we make the first attempt to conduct the attack in the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can reduce the attack error by up to a factor of two to three compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility trade-offs. Experimental results demonstrate that privacy leakage from the confidence scores constitutes a substantial privacy risk in VFL frameworks during the prediction phase, one that cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.
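To make the attack idea concrete, below is a minimal sketch (not the authors' implementation) of a white-box, gradient-based reconstruction of the kind the abstract describes: the active party treats the unknown passive-party features as trainable variables and optimizes them so that the model's predicted confidence scores match the scores observed during prediction. All names, the optimizer choice, and the hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of a white-box confidence-score reconstruction
# attack in VFL; assumes `model` is a torch.nn.Module over the
# concatenated features and returns class logits.
import torch

def reconstruct_passive_features(model, x_active, target_scores,
                                 n_passive, n_steps=2000, lr=0.01):
    """Recover one record's passive-party features by matching the
    model's predicted confidence scores to the observed ones."""
    # Start from a random guess for the unknown passive-party features.
    x_passive = torch.randn(n_passive, requires_grad=True)
    optimizer = torch.optim.Adam([x_passive], lr=lr)

    for _ in range(n_steps):
        optimizer.zero_grad()
        # White-box setting: the active party knows the model and its
        # own features, so it can run the full forward pass itself.
        x_full = torch.cat([x_active, x_passive]).unsqueeze(0)
        scores = torch.softmax(model(x_full), dim=1).squeeze(0)
        # Minimize the gap between predicted and observed scores.
        loss = torch.sum((scores - target_scores) ** 2)
        loss.backward()
        optimizer.step()

    return x_passive.detach()
```

In the black-box variant the paper also studies, the active party does not know the model parameters, so this direct gradient computation is unavailable and gradient-free or surrogate-model strategies are needed instead; the paper's specific black-box method is not reproduced here.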
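Likewise, a hedged sketch of the mitigation direction the abstract points to: compressing the confidence scores (rounding) and randomizing them (additive noise) before returning them to the active party, which degrades the signal the reconstruction attack optimizes against. The function name, noise distribution, and parameter values are assumptions for illustration, not the paper's exact mechanisms.

```python
# Hypothetical post-processing of confidence scores before release,
# combining information compression (rounding) and randomization
# (Laplace noise); parameters are illustrative, not from the paper.
import numpy as np

def compress_and_randomize(scores, decimals=1, noise_scale=0.05, rng=None):
    """Round and perturb a vector of confidence scores, then
    renormalize so the result is still a valid distribution."""
    rng = rng or np.random.default_rng()
    noisy = np.round(scores, decimals) + rng.laplace(0.0, noise_scale, scores.shape)
    noisy = np.clip(noisy, 0.0, None)  # keep scores non-negative
    return noisy / noisy.sum()         # renormalize to sum to 1
```

The privacy-utility trade-off noted in the abstract shows up directly in such a scheme: coarser rounding and a larger noise scale reveal less about the passive party's features but also distort the scores the active party relies on.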