Pseudonymization risk analysis in distributed systems

Impact Factor: 2.4 | JCR: Q3 | Category: COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Internet Services and Applications | Pub Date: 2019-01-08 | DOI: 10.1186/s13174-018-0098-z

Geoffrey K. Neumann, Paul Grace, Daniel Burns, Mike Surridge

Citations: 14

Abstract

In an era of big data, online services are becoming increasingly data-centric: they collect, process, analyze and anonymously disclose growing amounts of personal data in the form of pseudonymized data sets. It is crucial that such systems are engineered both to protect the privacy of the individual user (the data subject) and to give control of personal data back to that user. For pseudonymized data, this means that unwanted individuals should not be able to deduce sensitive information about the user. However, the plethora of pseudonymization algorithms and tunable parameters that currently exist makes it difficult for a non-expert developer (the data controller) to understand and realize strong privacy guarantees. In this paper we propose a principled Model-Driven Engineering (MDE) framework that models data services in terms of their pseudonymization strategies and identifies the risks of breaches of user privacy. A developer can explore alternative pseudonymization strategies and assess the effectiveness of each in terms of quantifiable metrics: (i) violations of privacy requirements for every user in the current data set; and (ii) the trade-off between conforming to these requirements and the usefulness of the data for its intended purposes. We demonstrate through an experimental evaluation that the information provided by the framework is useful, particularly in complex situations where privacy requirements differ between users, and that it can inform decisions to optimize a chosen strategy, in comparison to applying an off-the-shelf algorithm.
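The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the abstract does not say which privacy model or utility measure the paper uses. The sketch assumes a k-anonymity-style requirement with a hypothetical per-user threshold (`required_k`), generalization of two quasi-identifiers as the pseudonymization strategy, and the number of distinct groups as a crude stand-in for data usefulness. All names, fields, and sample records are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    user_id: str     # pseudonym, not a real identifier
    age: int         # quasi-identifier
    postcode: str    # quasi-identifier
    required_k: int  # hypothetical per-user privacy requirement


def generalize(record, age_bucket, postcode_prefix):
    """Coarsen the quasi-identifiers of one record."""
    age_floor = (record.age // age_bucket) * age_bucket
    return (age_floor, record.postcode[:postcode_prefix])


def evaluate(records, age_bucket, postcode_prefix):
    """Return (ids of users whose requirement is violated, number of groups).

    A user is in violation when fewer than required_k records share the
    user's generalized quasi-identifiers. The group count is an assumed
    utility proxy: more groups means finer-grained published data.
    """
    keys = [generalize(r, age_bucket, postcode_prefix) for r in records]
    group_sizes = Counter(keys)
    violations = [
        r.user_id
        for r, key in zip(records, keys)
        if group_sizes[key] < r.required_k
    ]
    return violations, len(group_sizes)


# Invented sample data; user u3 asks for stronger protection than the rest.
records = [
    Record("u1", 23, "SO171BJ", 2),
    Record("u2", 27, "SO172AA", 2),
    Record("u3", 34, "SO160YD", 3),
    Record("u4", 36, "SO163QT", 2),
    Record("u5", 38, "SO151GX", 2),
]

# Compare alternative strategies: (age bucket width, postcode prefix length).
for age_bucket, postcode_prefix in [(5, 4), (10, 4), (10, 2)]:
    violations, groups = evaluate(records, age_bucket, postcode_prefix)
    print(age_bucket, postcode_prefix, violations, groups)
```

Under these assumptions, coarser generalization removes violations but also reduces the number of distinct groups, which is the kind of trade-off the abstract describes a developer exploring.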

Source journal

Journal of Internet Services and Applications (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 3.70

Self-citation rate: 0.00%

Review time: 13 weeks