ATLAS: GAN-Based Differentially Private Multi-Party Data Sharing
Zhenya Wang; Xiang Cheng; Sen Su; Jintao Liang; Haocheng Yang
IEEE Transactions on Big Data, vol. 9, no. 4, pp. 1225-1237. Published 2023-03-18.
DOI: 10.1109/TBDATA.2023.3277716
URL: https://ieeexplore.ieee.org/document/10129015/
Citations: 1
Abstract
In this article, we study the problem of differentially private multi-party data sharing, in which the involved parties, assisted by a semi-honest curator, collectively generate a shared dataset while satisfying differential privacy. Inspired by the success of data synthesis with the generative adversarial network (GAN), we propose a novel GAN-based differentially private multi-party data sharing approach named ATLAS. In ATLAS, we extend the original GAN to multiple discriminators: each party holds a discriminator, while the curator holds the generator. To update the generator without compromising any party's privacy, we decompose the calculation of the generator's gradient and selectively sanitize the discriminators’ responses. Additionally, we propose two methods to improve the utility of the shared data: the collaborative discriminator filtering (CDF) method and the adaptive gradient perturbation (AGP) method. Specifically, the CDF method uses the trained discriminators to refine synthetic records, while the AGP method adaptively adjusts the noise scale during training to reduce the impact of differentially private noise on the final shared data. Extensive experiments on real-world datasets validate the superiority of our ATLAS approach.
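The abstract does not spell out how the discriminators' responses are sanitized, so the following is only a generic sketch of one common way to do it, assuming the Gaussian mechanism: each party clips its response vector to a bounded L2 norm and adds Gaussian noise before sending it to the curator, who averages what it receives. All names here (`clip_l2`, `sanitize_response`, `aggregate_responses`, `clip_bound`, `noise_multiplier`) are hypothetical and are not taken from the paper; the actual ATLAS procedure, including its selective sanitization and the AGP noise schedule, may differ.

```python
import math
import random


def clip_l2(vec, bound):
    """Scale vec so that its L2 norm is at most bound."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= bound or norm == 0.0:
        return list(vec)
    return [v * bound / norm for v in vec]


def sanitize_response(response, clip_bound, noise_multiplier, rng):
    """Party side (assumed): clip the response, then add Gaussian noise."""
    clipped = clip_l2(response, clip_bound)
    sigma = noise_multiplier * clip_bound  # noise scale tied to the clipping bound
    return [v + rng.gauss(0.0, sigma) for v in clipped]


def aggregate_responses(party_responses, clip_bound, noise_multiplier, rng):
    """Curator side (assumed): average the sanitized responses of all parties."""
    sanitized = [
        sanitize_response(r, clip_bound, noise_multiplier, rng)
        for r in party_responses
    ]
    n = len(sanitized)
    dim = len(sanitized[0])
    return [sum(s[i] for s in sanitized) / n for i in range(dim)]


# Toy responses from three parties; noise is switched off here so the
# effect of clipping alone is easy to inspect.
rng = random.Random(0)
responses = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
avg = aggregate_responses(responses, clip_bound=1.0, noise_multiplier=0.0, rng=rng)
```

In a real deployment `noise_multiplier` would be positive and chosen from the target privacy budget; an adaptive scheme in the spirit of AGP would vary it over training rounds, but the abstract gives no schedule, so none is shown.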
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.