{"title":"关于差异隐私的统计学观点:假设检验、表征和布莱克韦尔定理","authors":"Weijie J. Su","doi":"arxiv-2409.09558","DOIUrl":null,"url":null,"abstract":"Differential privacy is widely considered the formal privacy for\nprivacy-preserving data analysis due to its robust and rigorous guarantees,\nwith increasingly broad adoption in public services, academia, and industry.\nDespite originating in the cryptographic context, in this review paper we argue\nthat, fundamentally, differential privacy can be considered a \\textit{pure}\nstatistical concept. By leveraging a theorem due to David Blackwell, our focus\nis to demonstrate that the definition of differential privacy can be formally\nmotivated from a hypothesis testing perspective, thereby showing that\nhypothesis testing is not merely convenient but also the right language for\nreasoning about differential privacy. This insight leads to the definition of\n$f$-differential privacy, which extends other differential privacy definitions\nthrough a representation theorem. We review techniques that render\n$f$-differential privacy a unified framework for analyzing privacy bounds in\ndata analysis and machine learning. Applications of this differential privacy\ndefinition to private deep learning, private convex optimization, shuffled\nmechanisms, and U.S.~Census data are discussed to highlight the benefits of\nanalyzing privacy bounds under this framework compared to existing\nalternatives.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"56 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem\",\"authors\":\"Weijie J. Su\",\"doi\":\"arxiv-2409.09558\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Differential privacy is widely considered the formal privacy for\\nprivacy-preserving data analysis due to its robust and rigorous guarantees,\\nwith increasingly broad adoption in public services, academia, and industry.\\nDespite originating in the cryptographic context, in this review paper we argue\\nthat, fundamentally, differential privacy can be considered a \\\\textit{pure}\\nstatistical concept. By leveraging a theorem due to David Blackwell, our focus\\nis to demonstrate that the definition of differential privacy can be formally\\nmotivated from a hypothesis testing perspective, thereby showing that\\nhypothesis testing is not merely convenient but also the right language for\\nreasoning about differential privacy. This insight leads to the definition of\\n$f$-differential privacy, which extends other differential privacy definitions\\nthrough a representation theorem. We review techniques that render\\n$f$-differential privacy a unified framework for analyzing privacy bounds in\\ndata analysis and machine learning. 
Applications of this differential privacy\\ndefinition to private deep learning, private convex optimization, shuffled\\nmechanisms, and U.S.~Census data are discussed to highlight the benefits of\\nanalyzing privacy bounds under this framework compared to existing\\nalternatives.\",\"PeriodicalId\":501379,\"journal\":{\"name\":\"arXiv - STAT - Statistics Theory\",\"volume\":\"56 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Statistics Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.09558\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Statistics Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09558","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem
Differential privacy is widely regarded as the formal standard for privacy-preserving data analysis due to its robust and rigorous guarantees, and it has seen increasingly broad adoption in public services, academia, and industry. Although differential privacy originated in a cryptographic context, this review paper argues that it is, fundamentally, a \textit{purely} statistical concept. Leveraging a theorem due to David Blackwell, we demonstrate that the definition of differential privacy can be formally motivated from a hypothesis testing perspective, showing that hypothesis testing is not merely a convenient language but the right one for reasoning about differential privacy. This insight leads to the definition of $f$-differential privacy, which extends other differential privacy definitions through a representation theorem.
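To make the testing formulation concrete, here is a minimal sketch of the definition as it is usually stated in the $f$-differential privacy literature; the symbols $M$, $S$, $S'$, and $\phi$ below are our notation, not taken from the abstract. For distributions $P$ and $Q$, the trade-off function records the least type II error achievable at each type I error level:
\[
T(P,Q)(\alpha) \;=\; \inf_{\phi}\bigl\{\beta_\phi : \alpha_\phi \le \alpha\bigr\},
\qquad \alpha_\phi = \mathbb{E}_P[\phi], \quad \beta_\phi = 1 - \mathbb{E}_Q[\phi],
\]
where $\phi$ ranges over rejection rules for testing $P$ against $Q$. A mechanism $M$ is then $f$-differentially private if
\[
T\bigl(M(S), M(S')\bigr) \;\ge\; f \quad \text{for all neighboring datasets } S, S',
\]
and classical $(\varepsilon,\delta)$-DP is recovered as the special case $f_{\varepsilon,\delta}(\alpha) = \max\bigl\{0,\; 1-\delta-e^{\varepsilon}\alpha,\; e^{-\varepsilon}(1-\delta-\alpha)\bigr\}$.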
We review techniques that render $f$-differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning. Applications of this privacy definition to private deep learning, private convex optimization, shuffled mechanisms, and U.S.~Census data are discussed to highlight the benefits of analyzing privacy bounds under this framework relative to existing alternatives.
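As one illustration of working with trade-off functions numerically, the short Python sketch below evaluates the Gaussian trade-off curve $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$ and its standard conversion to an $(\varepsilon,\delta)$ guarantee. This is a hedged sketch based on standard results from the Gaussian differential privacy literature, not code from the paper itself.

# Sketch: Gaussian DP (GDP) trade-off curve and its conversion to
# (epsilon, delta)-DP, following standard results from the GDP literature.
import math
from scipy.stats import norm

def gaussian_tradeoff(alpha, mu):
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu): the least type II error
    achievable at type I error alpha when testing N(0,1) against N(mu,1)."""
    return norm.cdf(norm.ppf(1 - alpha) - mu)

def gdp_to_delta(epsilon, mu):
    """Smallest delta for which a mu-GDP mechanism is (epsilon, delta)-DP."""
    return norm.cdf(-epsilon / mu + mu / 2) \
        - math.exp(epsilon) * norm.cdf(-epsilon / mu - mu / 2)

# Example: composing a 1-GDP mechanism with itself k times yields sqrt(k)-GDP.
mu = math.sqrt(10)                   # ten-fold composition of 1-GDP steps
print(gaussian_tradeoff(0.05, mu))   # least type II error of a level-0.05 test
print(gdp_to_delta(1.0, mu))         # delta at epsilon = 1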