A probabilistic look ahead of anonymization: keynote talk

International Conference on Pattern Analysis and Intelligent Systems. Pub Date: 2011-03-25. DOI: 10.1145/1971690.1971691

Y. Saygin


Citations: 0

Abstract


Data anonymization is an expensive process, and sometimes the utility of the anonymized data may not justify the cost of anonymization. For example, in a distributed setting where the data reside at different sites and need to be anonymized without a trusted server, Secure Multiparty Computation (SMC) protocols must be employed. However, the cost of SMC protocols can be prohibitive, so the parties may want to look ahead of anonymization to decide whether running the expensive protocols is worthwhile. In this work, we describe a fast probabilistic look-ahead for k-anonymization of horizontally partitioned data. The look-ahead returns an upper bound on the probability that k-anonymity will be achieved at a given utility, where utility is quantified by commonly used metrics from the anonymization literature. The look-ahead process exploits prior information such as total data size, attribute distributions, or attribute correlations, all of which require only simple SMC operations to compute. More specifically, given only statistics on the private dataset, we show how to calculate the probability that a mapping of values to generalizations will make the private dataset k-anonymous.
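To make the idea of a look-ahead concrete, here is a minimal sketch of one way such an upper bound could be computed from statistics alone. It is not the method of the talk, which the abstract does not spell out. It assumes a single quasi-identifier attribute, records drawn independently from a known value distribution, and a known total size n. The function names (`generalized_distribution`, `class_ok_probability`, `k_anonymity_upper_bound`) and the example data are hypothetical. The bound uses the fact that k-anonymity requires every generalized class to hold either 0 or at least k records, so the joint probability cannot exceed the smallest per-class probability.

```python
from math import comb


def generalized_distribution(value_probs, mapping):
    """Sum value probabilities per generalized value (hypothetical helper)."""
    out = {}
    for value, p in value_probs.items():
        g = mapping[value]
        out[g] = out.get(g, 0.0) + p
    return out


def class_ok_probability(n, p, k):
    """P(X = 0 or X >= k) for X ~ Binomial(n, p).

    X is the number of records falling into one generalized class,
    assuming records are independent draws (an assumption of this sketch).
    """
    # A class violates k-anonymity when it holds between 1 and k-1 records.
    violating = sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(1, min(k, n + 1))
    )
    return 1.0 - violating


def k_anonymity_upper_bound(n, class_probs, k):
    """Upper bound on P(all classes are empty or hold >= k records).

    The joint event is an intersection of per-class events, so its
    probability is at most the minimum of the per-class probabilities.
    """
    return min(class_ok_probability(n, p, k) for p in class_probs)


if __name__ == "__main__":
    # Illustrative numbers only: an age attribute generalized into two ranges.
    value_probs = {"21": 0.10, "25": 0.20, "34": 0.30, "38": 0.40}
    mapping = {"21": "20-29", "25": "20-29", "34": "30-39", "38": "30-39"}
    classes = generalized_distribution(value_probs, mapping)
    bound = k_anonymity_upper_bound(n=50, class_probs=classes.values(), k=5)
    print(f"P(k-anonymous) <= {bound:.6f}")
```

If the bound is already small for a candidate mapping, the parties could skip the expensive SMC run for that mapping. A bound near 1 says little, since it ignores dependence between classes and attribute correlations.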

