Towards Accurate Truth Discovery With Privacy-Preserving Over Crowdsourced Data Streams

IF 8.9 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-01-29 DOI:10.1109/TKDE.2025.3536180

Zhimao Gong;Zhibang Yang;Shenghong Yang;Siyang Yu;Kenli Li;Mingxing Duan

Volume 37, Issue 4, Pages 2155-2168. Full text: https://ieeexplore.ieee.org/document/10857414/

Citations: 0

Abstract

Truth discovery endeavors to extract valuable information from multi-source data through weighted aggregation. Some studies have integrated differential privacy techniques into traditional truth discovery algorithms to protect data privacy. However, due to the neglect of outliers and limitations in budget allocation, these schemes still need improvement in the accuracy of discovery results. To solve these challenges, we propose a privacy-preserving scheme called PriPTD to achieve secure and accurate truth discovery services over crowdsourced data streams. Instead of assuming that worker weights are always stable between two neighboring timestamps, we delve deeper to consider outliers where worker weights change rapidly. Accordingly, we develop an outlier-aware weight estimation method with a time series model to capture and handle these outliers. Furthermore, to ensure data utility under a limited budget, we devise a weight-aware budget allocation algorithm. Its core idea is that timestamps with higher importance consume a larger proportion of the remaining budget. Additionally, we design a noise-aware error adjustment approach to mitigate the adverse effects of introduced noise on accuracy. Theoretical analysis and extensive experiments validate our scheme. Final comparative experiments against existing works confirm that our scheme achieves more accurate truth discovery while preserving privacy.
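
The two mechanisms the abstract names, weighted aggregation and spending a share of the remaining privacy budget per timestamp, can be illustrated with a minimal Python sketch. This is not the PriPTD algorithm, whose details the abstract does not give. The weight update below is a generic log-ratio rule from the classical truth discovery literature, swapped in as an assumption, and `importances` is a hypothetical per-timestamp score in [0, 1] standing in for whatever importance measure the paper defines. Outlier handling and noise-aware error adjustment are omitted.

```python
import math


def truth_discovery(observations, iterations=10):
    """Iterative weighted aggregation (generic sketch, not PriPTD).

    observations[k][j] is the value worker k reports for object j.
    Returns (estimated truths, worker weights).
    """
    n_workers = len(observations)
    n_objects = len(observations[0])
    weights = [1.0] * n_workers
    truths = [0.0] * n_objects
    for _ in range(iterations):
        # Truth update: weighted average of the workers' reports.
        total_w = sum(weights)
        truths = [
            sum(weights[k] * observations[k][j] for k in range(n_workers)) / total_w
            for j in range(n_objects)
        ]
        # Weight update: workers whose reports sit closer to the current
        # truths get larger weights. The small constant avoids log(x / 0).
        errors = [
            sum((observations[k][j] - truths[j]) ** 2 for j in range(n_objects)) + 1e-12
            for k in range(n_workers)
        ]
        total_err = sum(errors)
        weights = [math.log(total_err / e) for e in errors]
    return truths, weights


def allocate_budget(importances, total_epsilon):
    """Give each timestamp a fraction of the budget that is still unspent.

    importances[t] in [0, 1] is a hypothetical importance score; a higher
    score takes a larger share of the remaining budget, so the total spent
    never exceeds total_epsilon.
    """
    remaining = total_epsilon
    spent = []
    for importance in importances:
        eps_t = remaining * importance
        spent.append(eps_t)
        remaining -= eps_t
    return spent


if __name__ == "__main__":
    # Two consistent workers and one unreliable worker.
    reports = [[10.0, 20.0], [10.2, 19.8], [15.0, 30.0]]
    truths, weights = truth_discovery(reports)
    print("truths:", truths)
    print("weights:", weights)
    print("budget per timestamp:", allocate_budget([0.5, 0.25, 0.5], 1.0))
```

In a differentially private setting, the per-timestamp budget would then set the noise level, for example a Laplace mechanism with scale equal to the sensitivity divided by that timestamp's budget, so a smaller share means noisier output at that timestamp.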

Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore

11.70

Self-citation rate

3.40%

Articles published

515

Review time

6 months

About the journal: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.