Nebula: Efficient, Private and Accurate Histogram Estimation

arXiv - CS - Cryptography and Security Pub Date : 2024-09-15 DOI:arxiv-2409.09676

Ali Shahin Shamsabadi, Peter Snyder, Ralph Giles, Aurélien Bellet, Hamed Haddadi



Abstract

We present Nebula, a system for differentially private histogram estimation of data distributed among clients. Nebula enables clients to locally subsample and encode their data such that an untrusted server learns only data values that meet an aggregation threshold, which satisfies differential privacy guarantees. Compared with other private histogram estimation systems, Nebula uniquely achieves all of the following: (i) a strict upper bound on privacy leakage; (ii) client privacy under realistic trust assumptions; (iii) significantly better utility than standard local differential privacy systems; and (iv) no reliance on trusted third parties, multi-party computation, or trusted hardware. We provide both a formal evaluation of Nebula's privacy, utility, and efficiency guarantees and an empirical evaluation on three real-world datasets. We demonstrate that clients can encode and upload their data efficiently (only 0.0058 seconds of running time and 0.0027 MB of data communication) and privately (a strong differential privacy guarantee of ε = 1). On the United States Census dataset, Nebula's untrusted aggregation server estimates histograms with more than 88% better utility than the existing local deployment of differential privacy. Additionally, we describe a variant that allows clients to submit multi-dimensional data, with similar privacy, utility, and performance. Finally, we provide an open-source implementation of Nebula.
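
The sample-and-threshold idea described in the abstract can be illustrated with a minimal Python sketch. This is not Nebula's implementation: it simulates everything in the clear and does not model the client-side encoding that keeps below-threshold values hidden from the server. The function names, the `sample_prob` and `threshold` parameters, and the rescaling of counts by `1 / sample_prob` are assumptions made for illustration only.

```python
import random
from collections import Counter


def client_report(value, sample_prob, rng):
    """Hypothetical client step: submit the value with probability
    sample_prob, otherwise stay silent (return None)."""
    return value if rng.random() < sample_prob else None


def estimate_histogram(values, sample_prob, threshold, seed=0):
    """Hypothetical server step: count the submitted values, keep only those
    reported at least `threshold` times, and rescale the surviving counts
    by 1 / sample_prob to estimate the original frequencies."""
    if not 0.0 < sample_prob <= 1.0:
        raise ValueError("sample_prob must be in (0, 1]")
    rng = random.Random(seed)
    reports = (client_report(v, sample_prob, rng) for v in values)
    counts = Counter(r for r in reports if r is not None)
    return {v: c / sample_prob for v, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    # Illustrative data only: one common value and one rare value.
    data = ["common"] * 1000 + ["rare"] * 3
    print(estimate_histogram(data, sample_prob=0.5, threshold=20, seed=1))
```

In this sketch the rare value can never appear in the output, because at most three clients hold it and the threshold is 20; only values shared by many clients survive aggregation.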


