The power of the Dinur-Nissim algorithm: breaking privacy of statistical and graph databases

K. Choromanski, T. Malkin
Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, vol. 25 1, pp. 65-76. Published 2012-05-21. DOI: 10.1145/2213556.2213570 (https://doi.org/10.1145/2213556.2213570). Indexed via Semantic Scholar.
Citations: 9

Abstract

A few years ago, Dinur and Nissim (PODS, 2003) proposed an algorithm for breaking database privacy when statistical queries are answered with a perturbation error of magnitude o(√n) for a database of size n. This negative result is very strong: it completely reconstructs Ω(n) data bits with an algorithm that is simple, uses random queries, and places no restriction on the perturbation other than its magnitude. Their algorithm works in a model where the database consists of bits and the statistical queries asked by the adversary are sum queries over a subset of locations.

In this paper we extend the attack to much more general settings in terms of the type of statistical query allowed, the database domain, and the general tradeoff between perturbation and privacy. Specifically, we prove:

(1) For queries of the type ∑_{i=1}^{n} φ_{i} x_{i}, where the φ_{i} are i.i.d. with a finite third moment and positive variance (this includes as special cases the sum queries of Dinur-Nissim and several subsequent extensions), the quadratic relation between the perturbation and what the adversary can reconstruct holds even for smaller perturbations and even for a larger data domain. If φ_{i} is Gaussian, Poissonian, or bounded and of positive variance, this holds for arbitrary data domains and perturbations; for other φ_{i} it holds as long as the domain is not too large and the perturbation is not too small.

(2) A positive result showing that, for sum queries, the negative result mentioned above is tight. Specifically, we build a distribution on bit databases and an answering algorithm such that any adversary who wants to recover a little more than the negative result allows will not succeed, except with negligible probability.

(3) We consider a richer class of summation queries, focusing on databases representing graphs, where each entry is an edge and the query is a structural function of a subgraph. We show an attack that recovers a large portion of the graph edges, as long as the graph and the function satisfy certain properties.

The attacking algorithms in both our negative results are straightforward extensions of the Dinur-Nissim attack, based on asking φ-weighted queries or queries that choose a subgraph uniformly at random. The novelty of our work is in the analysis, which shows that this simple attack is much more powerful than was previously known, points to possible limits of the approach, and puts forth new application domains such as graph problems (which may occur in social networks, Internet graphs, etc.). These results may find applications not only for breaking privacy, but also in the positive direction, for recovering complicated structural information from inaccurate estimates about its substructures.
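
To make the φ-weighted query model in the abstract concrete, the following is a minimal simulation sketch, not the paper's algorithm or analysis. It assumes a bit database, i.i.d. Gaussian weights (or 0/1 weights, which give the random subset-sum queries of Dinur-Nissim), and a perturbation drawn uniformly from a bounded interval that is small relative to √n. The decoding step uses ordinary least squares followed by rounding, which is swapped in here purely for illustration; the function name `simulate_attack` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate_attack(n=200, m=800, noise_bound=2.0, weights="gaussian", seed=0):
    """Return the fraction of secret bits recovered from noisy weighted sums."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)  # secret database of n bits

    if weights == "gaussian":
        phi = rng.standard_normal((m, n))  # i.i.d. Gaussian query weights
    elif weights == "subset":
        # 0/1 weights: each query sums a uniformly random subset of locations
        phi = rng.integers(0, 2, size=(m, n)).astype(float)
    else:
        raise ValueError(f"unknown weights: {weights!r}")

    # Answers released by the curator: the true weighted sum plus a
    # perturbation whose magnitude is at most noise_bound (an assumption;
    # the abstract restricts only the magnitude of the perturbation).
    noise = rng.uniform(-noise_bound, noise_bound, size=m)
    answers = phi @ x + noise

    # Illustrative decoding (an assumption, not the paper's procedure):
    # solve the least-squares problem, then round each entry to a bit.
    estimate, *_ = np.linalg.lstsq(phi, answers, rcond=None)
    guess = (estimate > 0.5).astype(int)
    return float(np.mean(guess == x))


if __name__ == "__main__":
    for kind in ("gaussian", "subset"):
        print(kind, simulate_attack(weights=kind))
```

Raising `noise_bound` toward and beyond √n in this sketch is one way to see the reconstruction degrade, which is the perturbation-versus-privacy tradeoff that the abstract refers to.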