Differentially Private Real-Time Release of Sequential Data

IF 2.8 · Zone 4, Computer Science · Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Privacy and Security · Pub Date: 2022-11-07 · DOI: https://dl.acm.org/doi/10.1145/3544837

Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu

Citations: 0

Abstract

Many data analytics applications rely on temporal data, generated (and possibly acquired) sequentially for online analysis. How to release this type of data in a privacy-preserving manner is of great interest and more challenging than releasing one-time, static data. Because of the (potentially strong) temporal correlation within the data sequence, the overall privacy loss can accumulate significantly over time; an attacker with statistical knowledge of the correlation can be particularly hard to defend against. An idea that has been explored in the literature to mitigate this problem is to factor this correlation into the perturbation/noise mechanism. Existing work, however, either focuses on the offline setting (where perturbation is designed and introduced after the entire sequence has become available), or requires a priori information on the correlation in generating perturbation. In this study we propose an approach where the correlation is learned as the sequence is generated, and is used for estimating future data in the sequence. This estimate then drives the generation of the noisy released data. This method allows us to design better perturbation and is suitable for real-time operations. Using the notion of differential privacy, we show this approach achieves high accuracy with lower privacy loss compared to existing methods.
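The learn-then-estimate-then-release loop described above can be illustrated with a toy sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a scalar, roughly zero-mean sequence with AR(1)-like correlation, a Laplace mechanism applied to each value clipped to `[lo, hi]`, and a fixed blending weight. The names `release_stream`, `weight`, `lo`, `hi`, and `seed` are hypothetical. The correlation coefficient is fitted only from already-perturbed values, so the prediction is post-processing of noisy outputs. The sketch makes no claim about the total privacy loss accumulated over a correlated sequence.

```python
import numpy as np


def release_stream(x, epsilon, weight=0.5, lo=-1.0, hi=1.0, seed=0):
    """Toy real-time release (illustrative assumption, not the paper's method).

    Each value is clipped to [lo, hi] and perturbed with Laplace noise of
    scale (hi - lo) / epsilon. The released value blends that noisy value
    with a prediction made from earlier outputs only.
    """
    rng = np.random.default_rng(seed)
    scale = (hi - lo) / epsilon      # Laplace scale for one clipped value
    released = []
    prev_noisy = None
    sxy = 0.0                        # running sum of z[t-1] * z[t]
    sxx = 0.0                        # running sum of z[t-1] ** 2

    for x_t in x:
        noisy = float(np.clip(x_t, lo, hi)) + rng.laplace(0.0, scale)

        if prev_noisy is None:
            y_t = noisy              # nothing to predict from yet
        else:
            # Correlation learned so far, from past noisy pairs only.
            # Falls back to persistence (1.0) until a pair is available.
            a_hat = sxy / sxx if sxx > 0.0 else 1.0
            a_hat = float(np.clip(a_hat, -1.0, 1.0))
            prediction = a_hat * released[-1]
            y_t = weight * prediction + (1.0 - weight) * noisy
            # Update the running fit after the release decision.
            sxy += prev_noisy * noisy
            sxx += prev_noisy * prev_noisy

        released.append(y_t)
        prev_noisy = noisy

    return np.array(released)


# Usage: release a synthetic AR(1)-like sequence one step at a time.
rng = np.random.default_rng(42)
x = np.zeros(200)
for t in range(1, 200):
    x[t] = 0.9 * x[t - 1] + 0.1 * rng.standard_normal()
y = release_stream(x, epsilon=1.0, weight=0.7)
```

A larger `weight` leans more on the prediction and smooths the output, while a smaller one tracks the noisy observation more closely. How to set that trade-off from the learned correlation is the kind of design question the paper addresses, and this sketch leaves it fixed.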

Source journal

ACM Transactions on Privacy and Security (Computer Science: General Computer Science)

CiteScore: 5.20

Self-citation rate: 0.00%

About the journal: ACM Transactions on Privacy and Security (TOPS) (formerly known as TISSEC) publishes high-quality research results in the fields of information and system security and privacy. Studies addressing all aspects of these fields are welcomed, ranging from technologies, to systems and applications, to the crafting of policies.