Micro-aggregation-based heuristics for p-sensitive k-anonymity: one step beyond

A. Solanas, F. Sebé, J. Domingo-Ferrer
International Conference on Pattern Analysis and Intelligent Systems, 2008-03-29
DOI: 10.1145/1379287.1379300 (https://doi.org/10.1145/1379287.1379300)
Citations: 51

Abstract

Micro-data protection is a hot topic in the field of Statistical Disclosure Control (SDC), and it has gained special interest since the disclosure of 658,000 queries by the AOL search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with micro-data disclosure; among them, p-sensitive k-anonymity has recently been defined as a refinement of k-anonymity. This new property requires that there be at least p different values for each confidential attribute within the records sharing a combination of key attributes. As with k-anonymity, the algorithm originally proposed to achieve this property was based on generalisations and suppressions. When data sets are numerical, this approach causes several data utility problems, namely turning numerical key attributes into categorical ones, injecting new categories, injecting missing data, and so on. In this article, we recall the foundational concepts of micro-aggregation, k-anonymity and p-sensitive k-anonymity. We show that k-anonymity and p-sensitive k-anonymity can be achieved in numerical data sets by means of micro-aggregation heuristics properly adapted to this task. In addition, we present and evaluate two heuristics for p-sensitive k-anonymity which, being based on micro-aggregation, overcome most of the drawbacks of the generalisation and suppression method.
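To make the property concrete, the following is a minimal sketch, not the heuristics evaluated in the article. It assumes records are plain dictionaries, uses a naive fixed-size univariate micro-aggregation (sort on one numerical key attribute, cut into consecutive groups of at least k records, replace each key value by the group mean), and then checks the definition stated above. The function names, the `age` and `disease` attributes and the sample values are all hypothetical.

```python
from collections import defaultdict


def is_p_sensitive_k_anonymous(records, key_attrs, conf_attrs, k, p):
    """Check the definition: every group of records sharing the same
    key-attribute values has at least k records and at least p distinct
    values for each confidential attribute."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[a] for a in key_attrs)].append(rec)
    for members in groups.values():
        if len(members) < k:
            return False
        for attr in conf_attrs:
            if len({m[attr] for m in members}) < p:
                return False
    return True


def microaggregate(records, key_attr, k):
    """Naive univariate micro-aggregation (illustrative assumption only):
    sort by key_attr, form consecutive groups of k records (the last group
    absorbs any remainder), and replace key_attr by the group mean."""
    if len(records) < k:
        raise ValueError("need at least k records")
    ordered = sorted(records, key=lambda r: r[key_attr])
    n = len(ordered)
    masked = []
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # too few records left for another full group
            end = n
        group = ordered[start:end]
        mean = sum(r[key_attr] for r in group) / len(group)
        masked.extend({**r, key_attr: mean} for r in group)
        start = end
    return masked


# Hypothetical sample data: one numerical key attribute, one confidential one.
records = [
    {"age": 23, "disease": "flu"},
    {"age": 25, "disease": "asthma"},
    {"age": 31, "disease": "flu"},
    {"age": 34, "disease": "flu"},
    {"age": 52, "disease": "diabetes"},
    {"age": 58, "disease": "flu"},
]

masked_k2 = microaggregate(records, "age", 2)
masked_k3 = microaggregate(records, "age", 3)
print(is_p_sensitive_k_anonymous(masked_k2, ["age"], ["disease"], k=2, p=2))
print(is_p_sensitive_k_anonymous(masked_k3, ["age"], ["disease"], k=3, p=2))
```

With k = 2 the middle group (ages 31 and 34) contains a single disease value, so the masked data are 2-anonymous but not 2-sensitive. With k = 3 both groups hold two distinct disease values and the check passes. This illustrates why a micro-aggregation heuristic has to take the confidential attributes into account when forming groups, rather than grouping on the key attributes alone.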