Differentially private synthetic data generation for robust information fusion

IF 14.7 · CAS Region 1, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xiaohong Cai, Yi Sun, Zhaowen Lin, Ripeng Li, Tianwei Cai
Information Fusion, Volume 124, Article 103373
DOI: 10.1016/j.inffus.2025.103373 · Published 2025-06-12
Citations: 0

Abstract

Synthetic data is crucial in information fusion in terms of enhancing data representation and improving system robustness. Among synthesis methods, deep generative models exhibit excellent performance. However, recent studies have shown that the generation process faces privacy challenges because generative models memorize training instances. To maximize the benefits of synthetic data while ensuring data security, we propose a novel framework for generating and utilizing private synthetic data in information fusion processes. Furthermore, we present differentially private adaptive fine-tuning (DP-AdaFit), a parameter-efficient private fine-tuning method that applies differential privacy only to the singular values of the incremental updates. In detail, DP-AdaFit adaptively adjusts the rank of the low-rank weight increment matrices according to their importance scores, and achieves an equivalent privacy guarantee by injecting noise only into the gradients of the corresponding singular values. This approach substantially reduces the parameter budget while avoiding the excess noise introduced by singular value decomposition. We reduce memory and computation costs to nearly half those of the state of the art, and achieve an FID of 19.2 on CIFAR10. Our results demonstrate that trading off the weights contained in the differentially private fine-tuning parameters can improve model performance, even achieving generation quality competitive with a fully fine-tuned differentially private diffusion model. Our code is available at DP-AdaFit.
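The core step the abstract describes — clipping and perturbing only the gradients of the singular values of a low-rank weight increment, rather than all fine-tuned parameters — can be sketched as follows. This is a minimal NumPy illustration under our reading of the abstract, not the authors' released implementation; the toy loss, the AdaLoRA-style parameterization ΔW = P·diag(λ)·Q, and all hyperparameter values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_noisy_singular_grad(grad_lam, clip_norm=1.0, sigma=0.5, rng=rng):
    """DP-SGD-style step applied ONLY to the singular-value gradients:
    clip to clip_norm, then add Gaussian noise scaled by sigma * clip_norm.
    Per the abstract, this is the only place DP noise is injected."""
    norm = np.linalg.norm(grad_lam)
    clipped = grad_lam * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=grad_lam.shape)
    return clipped + noise

# Low-rank increment: delta_W = P @ diag(lam) @ Q.
# P and Q are trained non-privately here; only lam gets noisy gradients.
d, r = 8, 4
P = rng.standard_normal((d, r))
Q = rng.standard_normal((r, d))
lam = np.zeros(r)  # "singular values" of the increment, trained privately

# Illustrative update loop on a toy loss ||W0 + delta_W - W_target||^2.
W0 = rng.standard_normal((d, d))
W_target = rng.standard_normal((d, d))
for _ in range(100):
    delta = P @ np.diag(lam) @ Q
    residual = (W0 + delta) - W_target
    # d(loss)/d(lam_k) = sum_ij 2 * residual_ij * P_ik * Q_kj
    grad_lam = np.einsum("ij,ik,kj->k", 2 * residual, P, Q)
    lam -= 0.01 * dp_noisy_singular_grad(grad_lam)
```

Because λ has only r entries per layer (versus d·r entries for a full low-rank factor), the sensitivity-bearing parameter set is tiny, which is consistent with the abstract's claim of a reduced parameter budget under an equivalent privacy policy.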
Source journal
Information Fusion (Engineering & Technology · Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles per year: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems are welcome.