Mingqiang Xue, Panagiotis Karras, Chedy Raïssi, H. Pung
{"title":"Utility-driven anonymization in data publishing","authors":"Mingqiang Xue, Panagiotis Karras, Chedy Raïssi, H. Pung","doi":"10.1145/2063576.2063945","DOIUrl":null,"url":null,"abstract":"Privacy-preserving data publication has been studied intensely in the past years. Still, all existing approaches transform data values by random perturbation or generalization. In this paper, we introduce a radically different data anonymization methodology. Our proposal aims to maintain a certain amount of patterns, defined in terms of a set of properties of interest that hold for the original data. Such properties are represented as linear relationships among data points. We present an algorithm that generates a set of anonymized data that strictly preserves these properties, thus maintaining specified patterns in the data. Extensive experiments with real and synthetic data show that our algorithm is efficient, and produces anonymized data that affords high utility in several data analysis tasks while safeguarding privacy.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"1 1","pages":"2277-2280"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2063576.2063945","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Privacy-preserving data publication has been studied intensely in the past years. Still, all existing approaches transform data values by random perturbation or generalization. In this paper, we introduce a radically different data anonymization methodology. Our proposal aims to maintain a certain amount of patterns, defined in terms of a set of properties of interest that hold for the original data. Such properties are represented as linear relationships among data points. We present an algorithm that generates a set of anonymized data that strictly preserves these properties, thus maintaining specified patterns in the data. Extensive experiments with real and synthetic data show that our algorithm is efficient, and produces anonymized data that affords high utility in several data analysis tasks while safeguarding privacy.