Personalized Anonymization for Set-Valued Data by Partial Suppression
Takuma Nakagawa, Hiromi Arai, Hiroshi Nakagawa
2017 IEEE International Conference on Data Mining Workshops (ICDMW), published 2017-11-01
DOI: 10.1109/ICDMW.2017.142 (https://doi.org/10.1109/ICDMW.2017.142)
Set-valued data consists of records that are sets of items, such as the goods purchased by each individual. Methods for publishing and widely using set-valued data while protecting personal information have been studied extensively in the field of privacy-preserving data publishing. Basic models such as k-anonymity and k^m-anonymity cannot cope with attribute inference by an adversary who has background knowledge of the records. The ρ-uncertainty model, by contrast, makes it possible to prevent attribute inference in set-valued data when the confidence of the inference exceeds a certain level. Even so, the items to be protected must be designated in advance. In this research, we propose a new model that provides privacy protection better suited to each individual by protecting the different items designated for each record, and we build a heuristic algorithm that achieves this guarantee using partial suppression. In addition, because the computational complexity of the algorithm grows combinatorially with data size, we introduce the concept of probabilistic relaxation of the privacy guarantee. Finally, we report experimental results that evaluate the performance of the algorithms on real-world datasets.
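To make the confidence-based guarantee concrete, the following is a minimal Python sketch of how a per-record check might look. It is not the authors' algorithm. It assumes that the confidence of an inference is the usual association-rule confidence, support(q ∪ {s}) / support(q), that the adversary's background knowledge q is a non-empty subset of the other items in the record, and that each record carries its own set of designated items. The names `support`, `confidence`, `violations`, `protected`, and `max_size`, as well as the toy records, are hypothetical.

```python
from itertools import combinations


def support(records, itemset):
    """Count the records that contain every item in `itemset`."""
    return sum(1 for r in records if itemset <= r)


def confidence(records, antecedent, item):
    """Confidence of the rule antecedent -> item (0.0 if the antecedent never occurs)."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, antecedent | {item}) / base


def violations(records, protected, rho, max_size=2):
    """List (record index, antecedent, item, confidence) with confidence > rho.

    `protected` maps a record index to the items designated for that record
    (an assumed reading of the personalized model). Antecedents are limited
    to `max_size` items because the number of subsets grows combinatorially.
    """
    found = []
    for idx, record in enumerate(records):
        for item in sorted(protected.get(idx, set()) & record):
            others = sorted(record - {item})
            for size in range(1, min(max_size, len(others)) + 1):
                for combo in combinations(others, size):
                    conf = confidence(records, frozenset(combo), item)
                    if conf > rho:
                        found.append((idx, combo, item, conf))
    return found


# Toy data: only records 0 and 3 designate an item to protect.
records = [
    frozenset({"bread", "milk", "insulin"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "insulin"}),
    frozenset({"milk", "beer"}),
]
protected = {0: {"insulin"}, 3: {"beer"}}

result = violations(records, protected, rho=0.6)
print(result)  # one violation: bread -> insulin in record 0, confidence 2/3

# Partial suppression: delete a single occurrence of "bread" (record 2 only).
suppressed = list(records)
suppressed[2] = records[2] - {"bread"}
after = violations(suppressed, protected, rho=0.6)
print(after)  # []
```

In this toy data, record 2 also contains "insulin" but does not designate it, so it is not checked. Removing one occurrence of "bread" lowers the confidence of bread -> insulin from 2/3 to 1/2, which illustrates why suppressing individual occurrences can be enough without deleting an item from every record.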