GeoRecover: Recovery From Poisoning Attacks for LDP-Enabled Spatial Density Aggregation

IF 10.4 · Zone 2 (Computer Science) · JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-07-28 DOI:10.1109/TKDE.2025.3593289

Xinyue Sun;Qingqing Ye;Haibo Hu;Jiawei Duan;Hui He;Weizhe Zhang

Vol. 37, No. 10, pp. 5919-5933 · Journal Article · https://ieeexplore.ieee.org/document/11098680/

Citations: 0

Abstract


GeoRecover: Recovery From Poisoning Attacks for LDP-Enabled Spatial Density Aggregation

The spatial density distribution collected and aggregated from users’ trajectory data is vital for location-based services like regional popularity analysis and congestion measurement. However, spatial density aggregation poses privacy concerns since trajectory data usually originate from users. Local differential privacy (LDP) addresses these concerns by allowing users to perturb their data before reporting it. Yet, LDP is vulnerable to poisoning attacks where attackers manipulate data from malicious users. Recent studies attempt to defend against such attacks in LDP-enabled frequency estimation but suffer from inaccurate data recovery due to empirical presets of malicious user proportions and inaccurate malicious data estimation. These issues worsen in spatial density aggregation, as high-dimensional trajectory data help conceal malicious information. In this work, we propose GeoRecover, a method to defend against poisoning attacks in LDP-enabled spatial density aggregation by addressing previous limitations. GeoRecover designs an adaptive model to unify these attacks. Under this model, GeoRecover estimates the proportion of malicious users using statistical differences between genuine and malicious data and learns malicious data statistics through LDP properties. This allows GeoRecover to recover accurate spatial density distribution by subtracting malicious users’ contributions. Evaluations on two real-world datasets show GeoRecover outperforms state-of-the-art methods in recovery accuracy, defense capability, and practical performance.
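The subtraction idea in the abstract can be illustrated with a small sketch. This is not GeoRecover's algorithm: it assumes generalized randomized response (GRR) as the LDP mechanism over a handful of grid cells, and it assumes the malicious proportion `beta` and the malicious report distribution are already known, whereas the paper estimates both. All function and variable names here are hypothetical, and the frequencies are infinite-sample expectations rather than sampled reports.

```python
import math

def grr_probs(epsilon, k):
    """GRR over k cells: keep the true cell with prob p, else report each other cell with prob q."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)

def debias(report_freq, epsilon):
    """Standard unbiased GRR estimator: f_v = (c_v / n - q) / (p - q)."""
    p, q = grr_probs(epsilon, len(report_freq))
    return [(f - q) / (p - q) for f in report_freq]

def recover(report_freq, malicious_freq, beta, epsilon):
    """Subtract the (assumed known) malicious share from the reports, then debias."""
    genuine = [(f - beta * m) / (1.0 - beta)
               for f, m in zip(report_freq, malicious_freq)]
    return debias(genuine, epsilon)

# Toy setup (assumed values): 4 grid cells, epsilon = 1, 10% malicious users.
epsilon, beta = 1.0, 0.1
true_density = [0.5, 0.3, 0.15, 0.05]
p, q = grr_probs(epsilon, len(true_density))

# Expected report frequencies of genuine users after GRR perturbation.
genuine_reports = [p * f + q * (1.0 - f) for f in true_density]

# Malicious users skip perturbation and all report cell 3 (an output-poisoning assumption).
malicious_reports = [0.0, 0.0, 0.0, 1.0]

# What the aggregator observes: a mixture of genuine and malicious reports.
observed = [(1.0 - beta) * g + beta * m
            for g, m in zip(genuine_reports, malicious_reports)]

poisoned = debias(observed, epsilon)                             # naive estimate, cell 3 inflated
recovered = recover(observed, malicious_reports, beta, epsilon)  # matches true_density here

print("poisoned: ", [round(x, 4) for x in poisoned])
print("recovered:", [round(x, 4) for x in recovered])
```

In this idealized setting the recovered density equals the true one exactly; with real, finite reports and estimated values of `beta` and the malicious distribution, recovery would only be approximate.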


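Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months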

Journal description: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.