Preserving Privacy in Fine-Grained Data Distillation With Sparse Answers for Efficient Edge Computing

Impact factor 8.9; Region 1 (Computer Science); JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Ke Pan;Maoguo Gong;Kaiyuan Feng;Hui Li
Journal: IEEE Internet of Things Journal, vol. 12, no. 8, pp. 10058-10069
DOI: 10.1109/JIOT.2024.3508804
Published: 2024-12-09
Publication type: Journal Article
Open access: no
URL: https://ieeexplore.ieee.org/document/10786879/
Citations: 0

Abstract

In the Internet of Things (IoT), data distillation is regarded as a key method for condensing an original real dataset into a tiny synthetic dataset that lowers the training burden while retaining as much data utility as possible for training deep learning models. However, the data synthesis process may memorize sensitive information about the original dataset, which raises privacy concerns for data owners. To address this problem, we present a novel differential privacy (DP)-based data distillation algorithm. Specifically, in the data distillation phase, we randomly pick a training model from the model pool in each epoch and then perform fine-grained distribution matching to generate informative data that improves task-oriented model performance. In the privacy preservation phase, we use the sparse vector technique to selectively perturb the input features that matter most for model training, which protects the sensitive information contained in the original dataset while reducing privacy cost. Extensive experiments on several real-world datasets show that our algorithm achieves higher data utility and model accuracy than existing solutions.
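
As an illustration of the sparse vector technique named in the abstract, the sketch below implements the textbook variant (a noisy threshold, noisy comparisons, and a cap on the number of positive answers) in plain Python. It is not the authors' implementation: the function names, the per-feature importance scores, the threshold, the sensitivity value, and the way the budget is split are all assumptions made for this example.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sparse_select(scores, threshold, epsilon, max_hits, sensitivity=1.0, seed=None):
    """Return indices whose noisy score clears a noisy threshold.

    scores      -- hypothetical per-feature importance scores (assumption)
    threshold   -- public cutoff the scores are compared against (assumption)
    epsilon     -- total privacy budget spent on the selection step
    max_hits    -- stop after this many positive answers
    sensitivity -- how much one record can change any single score (assumption)
    """
    if epsilon <= 0 or max_hits <= 0 or sensitivity <= 0:
        raise ValueError("epsilon, max_hits and sensitivity must be positive")

    rng = random.Random(seed)
    # Each positive answer consumes epsilon / max_hits of the budget.
    sigma = 2.0 * max_hits * sensitivity / epsilon
    noisy_threshold = threshold + laplace(sigma, rng)

    selected = []
    for index, score in enumerate(scores):
        # Compare a freshly perturbed score against the perturbed threshold.
        if score + laplace(2.0 * sigma, rng) >= noisy_threshold:
            selected.append(index)
            if len(selected) >= max_hits:
                break
            # Refresh the threshold noise after every positive answer.
            noisy_threshold = threshold + laplace(sigma, rng)
    return selected


if __name__ == "__main__":
    importance = [0.1, 5.0, 0.2, 7.0, 9.0]  # made-up scores for illustration
    print(sparse_select(importance, threshold=1.0, epsilon=1.0, max_hits=2, seed=0))
```

Only the comparisons that come out positive consume budget, which is why this style of selection can keep the privacy cost low when few features are important. The returned indices would mark the features to perturb; how the paper scores importance and calibrates the perturbation itself is not stated in the abstract.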
Source journal
IEEE Internet of Things Journal (Computer Science, Information Systems)
CiteScore: 17.60
Self-citation rate: 13.20%
Articles published: 1982
About the journal: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments; and IoT standardization activities and technology development in standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.