Public data-enhanced multi-stage differentially private graph neural networks
Authors: Bingbing Zhang, Heyuan Huang, Lingbo Wei, Chi Zhang
Journal: Journal of Information Security and Applications, vol. 89, Article 103985
Published: 2025-02-09
DOI: 10.1016/j.jisa.2025.103985
URL: https://www.sciencedirect.com/science/article/pii/S2214212625000237
Impact factor: 3.8 (JCR Q2, Computer Science, Information Systems)
Citations: 0
Abstract
Existing differential privacy algorithms for graph neural networks (GNNs) typically add noise to private graph data to prevent the leakage of sensitive information. Although this noise often causes significant performance degradation, incorporating additional public graph data can mitigate the effect and improve the privacy-utility trade-off in differentially private GNNs. To improve this trade-off, we propose a method that uses public graph data in multi-stage training algorithms. First, to improve the model's ability to extract useful information from graph data, we apply an unsupervised pretraining algorithm to a public graph and integrate the pretrained model into private model training through parameter transfer. Second, we use multi-stage GNNs to turn neighborhood aggregation into a preprocessing step, which prevents the privacy budget from accumulating in the embedding layer and thereby improves model performance under the same privacy constraints. This method applies to both node differential privacy and edge differential privacy in GNNs. Third, for edge differential privacy, we introduce an aggregation perturbation mechanism that trains an edge prediction model on node features using the public graph data. We apply the trained model to the private graph data to predict potential neighbors for each node. We then compute an additional aggregation result from these predicted neighbors and combine it with the aggregation result derived from the true edges, so that the perturbed aggregation retains valuable information even under very low privacy budgets. Our results show that incorporating public graph data can improve the accuracy of differentially private GNNs by approximately 5% under the same privacy settings.
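The edge-level aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes sum aggregation over row-normalized features, Gaussian noise with a caller-supplied standard deviation `sigma` (calibrating `sigma` to a privacy budget is out of scope here), a top-k cosine-similarity rule standing in for the edge prediction model trained on public data, and a convex combination with weight `alpha` as the way the two aggregation results are merged. The aggregation is computed once, before any training, in the spirit of the preprocessing step the abstract mentions.

```python
import numpy as np

def row_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm, so adding or removing one directed
    edge changes a node's summed features by at most 1 in L2 norm."""
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def noisy_sum_aggregation(adj, x, sigma, rng):
    """Sum neighbor features over the true (private) edges, then add
    Gaussian noise with standard deviation sigma to every entry."""
    agg = adj @ x
    return agg + rng.normal(0.0, sigma, size=agg.shape)

def predicted_neighbors(x, k):
    """Hypothetical stand-in for the edge predictor: connect each node to
    its k most cosine-similar nodes. Assumes rows of x have unit norm.
    Only node features are used, never the private edges."""
    x = np.asarray(x, dtype=float)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-loops
    top = np.argsort(-sim, axis=1)[:, :k]     # indices of the k best matches
    pred = np.zeros((x.shape[0], x.shape[0]))
    np.put_along_axis(pred, top, 1.0, axis=1)
    return pred

def combined_aggregation(adj, x, sigma, k, alpha, rng):
    """Blend the noisy true-edge aggregation with the noise-free
    predicted-neighbor aggregation (assumed convex combination)."""
    x = row_normalize(x)
    private_part = noisy_sum_aggregation(adj, x, sigma, rng)
    public_part = predicted_neighbors(x, k) @ x
    return alpha * private_part + (1.0 - alpha) * public_part

# Usage on a toy path graph with 6 nodes and 4-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))
adj = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[i, j] = adj[j, i] = 1.0
out = combined_aggregation(adj, features, sigma=1.0, k=2, alpha=0.5, rng=rng)
print(out.shape)  # (6, 4)
```

In this sketch the predicted-neighbor term reads only node features, so when edges alone are treated as private it adds no noise of its own. That is one way to see why the combined result can stay informative when `sigma` has to be large.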
About the journal:
Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security. JISA links a vibrant scientific and research community with industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research with innovative industrial approaches from internationally renowned information security experts and researchers.