Community Privacy using the Sparse Vector Technique for Graph Statistics
Authors: Hara Seon, Hyeongjun Choi, B. Song, Jiwon Yoon
Published in: Proceedings of the 2022 European Symposium on Software Engineering, 2022-10-27
DOI: https://doi.org/10.1145/3571697.3571709
Citation count: 0
Abstract
Attackers have repeatedly targeted social networks to extract information about specific individuals. Differential privacy (DP) is one solution to such privacy disclosure issues. However, privacy concerns make people reluctant to share their social network data, which leaves too little data for analysis. For the same amount of added noise, DP degrades utility more on small data than on big data. To mitigate this issue, we propose the Community Attributes Privacy-preserving Method (CAPM), which uses the sparse vector technique to maintain a constant privacy level even on small data. CAPM obfuscates raw graph data to protect the network structure of a small network, and it can improve data utility compared to the existing model. We also suggest a privacy parameter that sets the privacy budget based on the similarity of communities in a network; this reflects the network topology and helps raise the accuracy of the synthetic graph. From a node privacy perspective, we inject noise into the edges of the central nodes in a community. Finally, we evaluate CAPM on real networks with respect to statistical utility and privacy protection. CAPM has an error rate of up to 20 percent in the number of edges, and its structural entropy error rate is below 17 percent on average. CAPM improves the average clustering coefficient by 82 percent over a recent modeling algorithm. In addition, its maximum modularity error rate of 18 percent outperforms the baseline's 43 percent. The evaluation results show that CAPM generates synthetic social graphs that target the relations among communities and achieves better data utility.
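For readers unfamiliar with the sparse vector technique named in the abstract, the following is a minimal sketch of its textbook form (the AboveThreshold mechanism), not CAPM itself. It assumes every query has sensitivity 1 and uses the common noise scales of 2/epsilon for the threshold and 4/epsilon for each query; the function names `laplace` and `above_threshold` are hypothetical and do not come from the paper.

```python
import random
from typing import Optional, Sequence


def laplace(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def above_threshold(
    query_answers: Sequence[float],
    threshold: float,
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Return the index of the first query whose noisy answer reaches the
    noisy threshold, or None if no query does.

    Assumption: each query has sensitivity 1 (e.g. a count that changes by
    at most 1 when one record changes).
    """
    rng = rng or random.Random()
    # The threshold is perturbed once, before any query is examined.
    noisy_threshold = threshold + laplace(2.0 / epsilon, rng)
    for index, answer in enumerate(query_answers):
        # Each query receives fresh noise; the mechanism halts at the first hit.
        if answer + laplace(4.0 / epsilon, rng) >= noisy_threshold:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical per-community counts; values are illustrative only.
    counts = [3.0, 7.0, 42.0, 5.0]
    print(above_threshold(counts, threshold=20.0, epsilon=1.0))
```

In this textbook form the privacy cost is paid for the single reported index rather than for every query examined, which is why the technique is attractive when the privacy budget is tight. How CAPM adapts it to community attributes is described in the paper, not here.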