A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem

arXiv - STAT - Statistics Theory | Pub Date: 2024-09-14 | DOI: arxiv-2409.09558

Weijie J. Su


Citations: 0

Abstract


Differential privacy is widely considered the formal privacy for privacy-preserving data analysis due to its robust and rigorous guarantees, with increasingly broad adoption in public services, academia, and industry. Despite originating in the cryptographic context, in this review paper we argue that, fundamentally, differential privacy can be considered a pure statistical concept. By leveraging a theorem due to David Blackwell, our focus is to demonstrate that the definition of differential privacy can be formally motivated from a hypothesis testing perspective, thereby showing that hypothesis testing is not merely convenient but also the right language for reasoning about differential privacy. This insight leads to the definition of f-differential privacy, which extends other differential privacy definitions through a representation theorem. We review techniques that render f-differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning. Applications of this differential privacy definition to private deep learning, private convex optimization, shuffled mechanisms, and U.S. Census data are discussed to highlight the benefits of analyzing privacy bounds under this framework compared to existing alternatives.
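
To make the hypothesis-testing view concrete, here is a minimal Python sketch. It is not taken from the abstract. It assumes the standard formulation from the f-differential privacy literature, in which an attacker tests "the data set is D" against "the data set is a neighboring D'", and a trade-off function maps each type I error level alpha to the smallest achievable type II error. A mechanism is f-differentially private if its trade-off function lies on or above f for every pair of neighboring data sets. The function names below are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

# Standard normal distribution, used for the Gaussian trade-off function.
_STD_NORMAL = NormalDist()


def tradeoff_eps_delta(alpha: float, eps: float, delta: float = 0.0) -> float:
    """Trade-off function corresponding to (eps, delta)-differential privacy.

    f(alpha) = max(0, 1 - delta - e^eps * alpha, e^(-eps) * (1 - delta - alpha)).
    Assumption: this is the usual form from the f-DP literature,
    not a formula stated in the abstract above.
    """
    return max(
        0.0,
        1.0 - delta - math.exp(eps) * alpha,
        math.exp(-eps) * (1.0 - delta - alpha),
    )


def tradeoff_gaussian(alpha: float, mu: float) -> float:
    """Trade-off function for testing N(0, 1) against N(mu, 1).

    G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu), computed here as
    Phi(-Phi^{-1}(alpha) - mu), which is the same quantity by symmetry.
    """
    if alpha <= 0.0:
        return 1.0
    if alpha >= 1.0:
        return 0.0
    return _STD_NORMAL.cdf(-_STD_NORMAL.inv_cdf(alpha) - mu)


if __name__ == "__main__":
    # Compare the attacker's best type II error under both guarantees.
    for alpha in (0.01, 0.05, 0.10, 0.50):
        print(
            f"alpha={alpha:.2f}  "
            f"eps=1: {tradeoff_eps_delta(alpha, eps=1.0):.4f}  "
            f"mu=1: {tradeoff_gaussian(alpha, mu=1.0):.4f}"
        )
```

With eps = 0 and delta = 0, or with mu = 0, both functions reduce to 1 - alpha, the trade-off curve of an attacker who cannot do better than random guessing. Larger eps or mu pushes the curve down, meaning the two hypotheses are easier to tell apart and the privacy guarantee is weaker.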

