{"title":"Statistical Data Privacy: A Song of Privacy and Utility","authors":"Aleksandra Slavković, Jeremy Seeman","doi":"10.1146/annurev-statistics-033121-112921","DOIUrl":null,"url":null,"abstract":"To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms that sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and the more recent differential privacy (DP). Despite framing differences, both SDC and DP share the same statistical problems at their core. For inference problems, either we may design optimal release mechanisms and associated estimators that satisfy bounds on disclosure risk measures, or we may adjust existing sanitized output to create new statistically valid and optimal estimators. Regardless of design or adjustment, in evaluating risk and utility, valid statistical inferences from mechanism outputs require uncertainty quantification that accounts for the effect of the sanitization mechanism that introduces bias and/or variance. In this review, we discuss the statistical foundations common to both SDC and DP, highlight major developments in SDP, and present exciting open research problems in private inference.","PeriodicalId":48855,"journal":{"name":"Annual Review of Statistics and Its Application","volume":"44 1","pages":"0"},"PeriodicalIF":7.4000,"publicationDate":"2023-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annual Review of Statistics and Its Application","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1146/annurev-statistics-033121-112921","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 5
Abstract
To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms that sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and the more recent differential privacy (DP). Despite framing differences, both SDC and DP share the same statistical problems at their core. For inference problems, either we may design optimal release mechanisms and associated estimators that satisfy bounds on disclosure risk measures, or we may adjust existing sanitized output to create new statistically valid and optimal estimators. Regardless of design or adjustment, in evaluating risk and utility, valid statistical inferences from mechanism outputs require uncertainty quantification that accounts for the effect of the sanitization mechanism that introduces bias and/or variance. In this review, we discuss the statistical foundations common to both SDC and DP, highlight major developments in SDP, and present exciting open research problems in private inference.
期刊介绍:
The Annual Review of Statistics and Its Application publishes comprehensive review articles focusing on methodological advancements in statistics and the utilization of computational tools facilitating these advancements. It is abstracted and indexed in Scopus, Science Citation Index Expanded, and Inspec.