Differential Private Data Stream Analytics in the Local and Shuffle Models

IF 9.2 · Zone 2 (Computer Science) · JCR Q1: COMPUTER SCIENCE, INFORMATION SYSTEMS
Shaowei Wang;Jin Li;Yun Peng;Kongyang Chen;Wei Yang;Hui Jiang;Jin Li
IEEE Transactions on Mobile Computing, vol. 24, no. 7, pp. 6701-6717. Journal Article. Published 2025-04-15.
DOI: 10.1109/TMC.2025.3559621
URL: https://ieeexplore.ieee.org/document/10964077/
Citation count: 0

Abstract

We study online data analytics with differential privacy (DP) in decentralized settings. Online data analytics with local DP protection is widely adopted in real-world applications. Despite numerous efforts in this field, significant gaps in utility and functionality remain compared with its offline counterpart. We present ExSub, an optimal, streamable mechanism for sparse vector estimation under local DP. The mechanism enables a range of online analytics on streaming binary vectors, covering multi-dimensional binary, categorical, and set-valued data. By leveraging the negative correlation of occurrence events in the sparse vector, we attain an optimal error rate under local privacy constraints while requiring only streamable computations. To surpass the error barrier of local privacy, we also study the ExSub randomizer in the newly emerging (single-message) shuffle model of DP and provide nearly tight privacy amplification bounds for it. Additionally, we leverage the online shuffle model, which independently permutes users' messages at each timestamp, to design a simplified randomization strategy that approximately reaches the accuracy of the Gaussian mechanism in central DP. In experiments on both synthetic and real-world datasets, the ExSub mechanism in the local model reduces error by 40%–60% compared with state-of-the-art (SOTA) approaches. ExSub in the shuffle model further reduces error by over 85%, and the online shuffle protocol reduces error by over 99.7%.
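To make the setting concrete, the following is a minimal sketch of a standard baseline, not the ExSub mechanism itself, whose construction the abstract does not give. It assumes categorical data encoded as one-hot binary vectors, applies symmetric per-bit randomized response (unary encoding in the style of basic RAPPOR) on each user's device, permutes the reports independently at every timestamp, and debiases the bit counts into frequency estimates. All function names and parameters are hypothetical, and the sketch does not compute any shuffle-model privacy amplification bound.

```python
import math
import random


def keep_probability(eps):
    # Two one-hot vectors differ in exactly 2 bits, so each bit is
    # randomized with budget eps / 2 to give eps-local DP overall.
    return math.exp(eps / 2) / (math.exp(eps / 2) + 1.0)


def randomize_one_hot(item, d, eps, rng):
    """Locally randomize the one-hot encoding of `item` in a domain of size d."""
    p_keep = keep_probability(eps)
    report = []
    for j in range(d):
        bit = 1 if j == item else 0
        if rng.random() >= p_keep:  # flip with probability 1 - p_keep
            bit = 1 - bit
        report.append(bit)
    return report


def shuffle_reports(reports, rng):
    """Stand-in for the shuffler: return a uniformly permuted copy."""
    shuffled = list(reports)
    rng.shuffle(shuffled)
    return shuffled


def estimate_frequencies(reports, d, eps):
    """Debias per-bit counts: E[mean_j] = q + f_j * (p - q)."""
    n = len(reports)
    p = keep_probability(eps)
    q = 1.0 - p
    return [(sum(r[j] for r in reports) / n - q) / (p - q) for j in range(d)]


def run_stream(stream, d, eps, seed=0):
    """Process one batch of user items per timestamp, shuffling each batch independently."""
    rng = random.Random(seed)
    estimates = []
    for items in stream:
        reports = [randomize_one_hot(x, d, eps, rng) for x in items]
        reports = shuffle_reports(reports, rng)
        estimates.append(estimate_frequencies(reports, d, eps))
    return estimates


if __name__ == "__main__":
    # Two timestamps, 20000 users each, 4 categories (synthetic toy data).
    stream = [[0] * 10000 + [1] * 10000, [2] * 15000 + [3] * 5000]
    for t, est in enumerate(run_stream(stream, d=4, eps=2.0)):
        print(t, [round(v, 3) for v in est])
```

The estimator depends only on per-bit sums, so shuffling leaves the estimates unchanged; its purpose in the shuffle model is to strengthen the privacy guarantee seen by the analyzer, which this sketch does not quantify.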
Source journal
IEEE Transactions on Mobile Computing (Engineering and Technology: Telecommunications)
CiteScore: 12.90
Self-citation rate: 2.50%
Articles published: 403
Review time: 6.6 months
Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.