A Re-Identification Risk-Based Anonymization Framework for Data Analytics Platforms

H. Silva, Tânia Basso, Regina L. O. Moraes, D. Elia, S. Fiore
2018 14th European Dependable Computing Conference (EDCC), published 2018-09-01
DOI: 10.1109/EDCC.2018.00026 (https://doi.org/10.1109/EDCC.2018.00026)
Citations: 5

Abstract

Preserving individual privacy is one of the major issues in the context of Big Data, since handling huge volumes of data may contribute to the disclosure of sensitive or personally identifiable information. In fact, even when data is anonymized there is a risk of re-identification through privacy attacks. This paper presents a re-identification risk-based anonymization framework for big data analytics platforms. This framework is based on anonymization policies and allows applying anonymization techniques and models in two stages: during the ETL process and before exporting the statistical results of data analytics. This second stage evaluates the data re-identification risk and increases the anonymity level if it is necessary to reduce this risk. Although generic, the implementation of the framework reported in this work was integrated into Ophidia as a case study. Privacy attacks were performed to check the effectiveness of the re-identification process. Results are promising, showing a low probability of re-identification in two different scenarios.
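The second stage described in the abstract (evaluate the re-identification risk, then raise the anonymity level if the risk is too high) can be illustrated with a minimal sketch. The abstract does not state which risk model or anonymization techniques the framework uses, so everything here is an assumption: the risk is taken to be 1 divided by the size of the smallest equivalence class over the quasi-identifiers, the only technique is age bucketing, and all function names, field names, and the threshold are hypothetical. None of this is Ophidia's API.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest equivalence class (assumed model)."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(classes.values())


def generalize_age(record, width):
    """Replace an exact age with a bucket of the given width (hypothetical technique)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def anonymize_until_safe(records, quasi_identifiers, threshold, widths=(5, 10, 20, 40)):
    """Try coarser age buckets until the risk is at or below the threshold."""
    current = records
    risk = reidentification_risk(current, quasi_identifiers)
    for width in widths:
        if risk <= threshold:
            break
        # Always generalize from the original records so buckets stay well defined.
        current = [generalize_age(r, width) for r in records]
        risk = reidentification_risk(current, quasi_identifiers)
    return current, risk


# Illustrative data only; not taken from the paper.
records = [
    {"age": 23, "zip": "13083"},
    {"age": 27, "zip": "13083"},
    {"age": 31, "zip": "13083"},
    {"age": 38, "zip": "13083"},
]
anonymized, risk = anonymize_until_safe(records, ["age", "zip"], threshold=0.5)
print(risk, [r["age"] for r in anonymized])
```

In this sketch the loop stops at the first bucket width that meets the threshold, which keeps as much detail in the exported data as the assumed risk limit allows.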