Community Privacy using the Sparse Vector Technique for Graph Statistics

Proceedings of the 2022 European Symposium on Software Engineering. Pub Date: 2022-10-27. DOI: 10.1145/3571697.3571709

Hara Seon, Hyeongjun Choi, B. Song, Jiwon Yoon

Abstract

Social networks have been the target of various attacks that extract information about specific individuals. Differential privacy (DP) is one solution to such privacy disclosure. However, privacy concerns make people reluctant to share their social network data, which leaves too little data for analysis. For the same amount of added noise, DP degrades data utility more on small data than on big data. To mitigate this issue, we propose the Community Attributes Privacy-preserving Method (CAPM), which uses the sparse vector technique to maintain a constant privacy level even on small data. CAPM obfuscates raw graph data to protect the network structure of a small network, and it can improve data utility compared to the existing model. We also suggest a privacy parameter that sets the privacy budget based on the similarity of communities in a network; this reflects the network topology and helps raise the accuracy of the synthetic graph. From a node privacy view, we inject noise into the edges of central nodes in a community. Finally, we evaluate CAPM on real networks in terms of statistical utility and privacy protection. CAPM's error rate in the number of edges is at most 20 percent, and its error rate in structural entropy is below 17 percent on average. CAPM improves the average clustering coefficient by 82 percent over the recent modeling algorithm. In addition, its maximum error rate in modularity, 18 percent, is lower than the baseline's 43 percent. The evaluation results show that CAPM generates synthetic social graphs that target the relations among communities and achieves better data utility.
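
For readers unfamiliar with the sparse vector technique named in the abstract, the following is a minimal sketch of its textbook form (often called AboveThreshold), not CAPM itself. The abstract does not describe how CAPM applies the technique, so the function names, the noise scales (the standard 2/epsilon for the threshold and 4/epsilon for each query), and the idea of feeding it a list of graph statistics are all assumptions made for illustration.

```python
import random


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def above_threshold(query_answers, threshold, epsilon, sensitivity=1.0, rng=None):
    """Textbook AboveThreshold (hypothetical helper, not the paper's CAPM).

    Returns the index of the first query whose noisy answer meets the noisy
    threshold, or None if no query does. The threshold is perturbed once and
    every query answer is perturbed independently.
    """
    rng = rng or random.Random()
    noisy_threshold = threshold + laplace(2.0 * sensitivity / epsilon, rng)
    for index, answer in enumerate(query_answers):
        noisy_answer = answer + laplace(4.0 * sensitivity / epsilon, rng)
        if noisy_answer >= noisy_threshold:
            return index  # halt at the first "above" answer
    return None


if __name__ == "__main__":
    # Hypothetical per-community statistics (for example, edge counts).
    stats = [3, 7, 42, 5]
    print(above_threshold(stats, threshold=20, epsilon=1.0, rng=random.Random(0)))
```

The appeal of this construction is that the privacy cost is charged for the single "above" answer rather than for every query examined, which is one reason it suits settings with many low-valued statistics.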
