Selection and Verification of Privacy Parameters for Local Differentially Private Data Aggregation
Authors: Snehkumar Shahani, Abraham Jibi, R. Venkateswaran
Venue: 2021 the 5th International Conference on Information System and Data Mining
Published: 2021-05-27
DOI: 10.1145/3471287.3471306 (https://doi.org/10.1145/3471287.3471306)
Acquiring and aggregating data from a group of individuals is crucial for studying their general behavior. Differentially Private (DP) techniques, characterized by the parameter ϵ, help protect the Individually Identifiable Data (IID) of the individuals participating in such data collection. However, these techniques reduce the usefulness of the data, creating a trade-off between usefulness and privacy and making the selection of ϵ an important problem to settle before data acquisition. In this work, we use a mathematical formalism to estimate usefulness and privacy for the sum query as the aggregate analysis under the local model of privacy. The resulting mathematical relation allows a variety of optimization techniques, discussed in the work, to be applied to select an optimal value of ϵ. Existing methods for selecting ϵ are based on financial parameters, but they rely heavily on past data and domain knowledge, which may not be available in many cases. To address this, we provide knee-point-based recommendations along with a selection criterion for choosing the recommendation method depending on the information available. This allows analysts to make informed decisions while negotiating the value of ϵ. Our experiments on synthetic and real-world datasets demonstrate the strength of the mathematical model and the recommended values.
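The trade-off described above can be illustrated with a minimal sketch. The abstract does not state the paper's actual formalism, so everything below is an assumption for illustration: each participant perturbs a bounded value with the Laplace mechanism (one common way to obtain ϵ-local DP), usefulness is measured by the root-mean-square error of the noisy sum, and the knee point is taken as the point of the error curve farthest from the chord joining its endpoints. The names `local_dp_sum`, `expected_rmse`, and `knee_point` are hypothetical and not taken from the paper.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_dp_sum(values, epsilon, lower, upper, rng):
    """Sum of locally perturbed reports (assumed Laplace mechanism).

    Each participant clips its value to [lower, upper] and adds Laplace
    noise of scale (upper - lower) / epsilon before reporting.
    """
    scale = (upper - lower) / epsilon
    return sum(min(max(v, lower), upper) + laplace_noise(scale, rng)
               for v in values)


def expected_rmse(n, epsilon, lower, upper):
    # Laplace(b) has variance 2 * b**2; n independent noises add up,
    # so the standard deviation of the noisy sum is sqrt(2 * n) * b.
    return math.sqrt(2.0 * n) * (upper - lower) / epsilon


def knee_point(xs, ys):
    """Return the x whose point lies farthest from the end-to-end chord.

    Both axes are rescaled to [0, 1] first, so the result depends on the
    range of the grid that is passed in. xs must be sorted and ys must
    not be constant.
    """
    x_span = xs[-1] - xs[0]
    y_lo, y_hi = min(ys), max(ys)
    xn = [(x - xs[0]) / x_span for x in xs]
    yn = [(y - y_lo) / (y_hi - y_lo) for y in ys]
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    norm = math.hypot(dx, dy)
    dists = [abs(dy * (x - xn[0]) - dx * (y - yn[0])) / norm
             for x, y in zip(xn, yn)]
    best = max(range(len(xs)), key=dists.__getitem__)
    return xs[best]


# Illustrative grid of candidate epsilon values (an arbitrary choice).
epsilons = [0.1 * k for k in range(1, 51)]
errors = [expected_rmse(1000, eps, 0.0, 1.0) for eps in epsilons]
print("knee-point epsilon on this grid:", knee_point(epsilons, errors))
```

Because the axes are normalised before the distance is computed, the recommended ϵ shifts when the candidate grid changes; a real analysis would fix that range from the privacy and usefulness bounds that are actually negotiable.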