Xiaohong Cai, Yi Sun, Zhaowen Lin, Ripeng Li, Tianwei Cai
{"title":"Differentially private synthetic data generation for robust information fusion","authors":"Xiaohong Cai, Yi Sun, Zhaowen Lin, Ripeng Li, Tianwei Cai","doi":"10.1016/j.inffus.2025.103373","DOIUrl":null,"url":null,"abstract":"<div><div>Synthetic data is crucial in information fusion in term of enhancing data representation and improving system robustness. Among all synthesis methods, deep generative models exhibit excellent performance. However, recent studies have shown that the generation process faces privacy challenges due to the memorization of training instances by generative models. To maximize the benefits of synthesis data while ensuring data security, we propose a novel framework for the generation and utilization of private synthetic data in information fusion processes. Furthermore, we present differential private adaptive fine-tuning (DP-AdaFit), a method for private parameter efficient fine-tuning that applies differential privacy only to the singular values of the incremental updates. In details, DP-AdaFit adaptively adjusts the rank of the low-rank weight increment matrices according to their importance score, and allows us to achieve an equivalent privacy policy by only injecting noise into gradient of the corresponding singular values. Such a novel approach essentially reduces their parameter budget but avoids too much noise introduced by the singular value decomposition. We decrease the cost on memory and computation nearly half of the SOTA, and achieve the FID of 19.2 on CIFAR10. Our results demonstrate that trading off weights contained in the differential privacy fine-tuning parameters can improve model performance, even achieving generation quality competitive with differential privacy full fine-tuning diffusion model. Our code is available at <span><span>DP-AdaFit</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"124 ","pages":"Article 103373"},"PeriodicalIF":14.7000,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525004464","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Synthetic data is crucial in information fusion in term of enhancing data representation and improving system robustness. Among all synthesis methods, deep generative models exhibit excellent performance. However, recent studies have shown that the generation process faces privacy challenges due to the memorization of training instances by generative models. To maximize the benefits of synthesis data while ensuring data security, we propose a novel framework for the generation and utilization of private synthetic data in information fusion processes. Furthermore, we present differential private adaptive fine-tuning (DP-AdaFit), a method for private parameter efficient fine-tuning that applies differential privacy only to the singular values of the incremental updates. In details, DP-AdaFit adaptively adjusts the rank of the low-rank weight increment matrices according to their importance score, and allows us to achieve an equivalent privacy policy by only injecting noise into gradient of the corresponding singular values. Such a novel approach essentially reduces their parameter budget but avoids too much noise introduced by the singular value decomposition. We decrease the cost on memory and computation nearly half of the SOTA, and achieve the FID of 19.2 on CIFAR10. Our results demonstrate that trading off weights contained in the differential privacy fine-tuning parameters can improve model performance, even achieving generation quality competitive with differential privacy full fine-tuning diffusion model. Our code is available at DP-AdaFit.
期刊介绍:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.