Finetune and Label Reversal: Privacy-preserving unlearning strategies for GAN models in cloud computing

IF 3.1; Zone 2, Computer Science; JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Standards & Interfaces Pub Date : 2025-02-06 DOI:10.1016/j.csi.2025.103976

Lang Li , Pei-gen Ye , Zhengdao Li , Zuopeng Yang , Zhenxin Zhang

Volume 93, Article 103976. Journal Article. https://www.sciencedirect.com/science/article/pii/S0920548925000054

Citations: 0

Abstract

As governments place growing emphasis on data protection, machine unlearning has become a prominent research topic. Machine unlearning is the process of eliminating the influence of specific samples from a machine learning model. Most existing work on machine unlearning focuses on supervised learning, with limited research on unsupervised learning models such as GANs (Generative Adversarial Networks). GANs are widely used on cloud computing platforms to generate high-quality synthetic data for applications including image synthesis, data augmentation, and anomaly detection. However, these models are often trained on large datasets that may contain personal or sensitive information, raising concerns about data privacy in cloud environments. Because GANs differ structurally from traditional supervised learning models, transferring classical supervised unlearning algorithms to GANs poses significant challenges. Furthermore, the evaluation metrics used for supervised unlearning algorithms are not directly applicable to GANs. To address these challenges, we propose two novel methods for unlearning in GANs: Finetune and Label Reversal. Finetune extends supervised unlearning by channeling the residual data (that is, the data remaining once the samples to be unlearned are excluded) back into a pretrained GAN model for further refinement. Label Reversal reverses the labels of the unlearning samples and trains iteratively to neutralize their influence on the model. To meet the needs of cloud-based GAN applications, we also introduce an evaluation metric tailored to GAN unlearning, based on prediction loss. This metric ensures the reliability of unlearning methods while maintaining the quality of synthetic data generated in cloud environments. Extensive experiments on the SVHN, CIFAR10, and CIFAR100 datasets demonstrate the efficiency of our methods. Our approach effectively removes specific samples from GAN models while preserving their generative capabilities, making it highly suitable for privacy-preserving GAN applications in cloud computing.
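The abstract does not give the algorithms in detail, so the following is only a toy sketch of how the two strategies and a prediction-loss check might look. It assumes that the "labels" in Label Reversal are the discriminator's real/fake labels and that the metric is the discriminator's loss on the unlearning samples; both are assumptions, not statements from the paper. The 1-D logistic discriminator, the data values, and every function name are hypothetical, and no generator is modelled.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y, eps=1e-12):
    """Binary cross-entropy of one prediction p against label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def finetune_batch(retain, fake):
    """Finetune (assumed form): train on the residual real data only."""
    return [(x, 1.0) for x in retain] + [(x, 0.0) for x in fake]

def label_reversal_batch(retain, forget, fake):
    """Label Reversal (assumed form): forget samples are real but labelled 0."""
    return finetune_batch(retain, fake) + [(x, 0.0) for x in forget]

def train_discriminator(batch, w=0.0, b=0.0, lr=0.1, steps=2000):
    """Full-batch gradient descent on a toy discriminator D(x) = sigmoid(w*x + b)."""
    n = len(batch)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in batch:
            err = sigmoid(w * x + b) - y  # gradient of BCE w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def prediction_loss(params, samples, label=1.0):
    """Mean discriminator loss on `samples`, scored as if they were real."""
    w, b = params
    return sum(bce(sigmoid(w * x + b), label) for x in samples) / len(samples)

# Hypothetical 1-D data: real samples to keep, real samples to unlearn, fakes.
retain = [1.0, 1.2, 0.8, 1.1]
forget = [3.0, 3.2]
fake = [-1.0, -1.2, -0.8]

# "Pretrained" discriminator: saw retain and forget samples as real.
original = train_discriminator(finetune_batch(retain + forget, fake))
# Both strategies start from the pretrained parameters.
finetuned = train_discriminator(finetune_batch(retain, fake), *original)
reversed_ = train_discriminator(label_reversal_batch(retain, forget, fake), *original)

loss_before = prediction_loss(original, forget)
loss_after = prediction_loss(reversed_, forget)
print(f"forget-set loss before: {loss_before:.4f}, after label reversal: {loss_after:.4f}")
```

In this toy setting a higher loss on the forget samples after Label Reversal is read as the discriminator no longer treating them as familiar real data. A linear discriminator is too simple to show any comparable effect for the Finetune variant, so `finetuned` is included only to illustrate the batch construction.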




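Source journal

Computer Standards & Interfaces (Engineering & Technology, Computer Science: Software Engineering)

CiteScore

11.90

Self-citation rate

16.00%

Publication volume

Review time

6 months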

About the journal: The quality of software, well-defined interfaces (hardware and software), the process of digitalisation, and accepted standards in these fields are essential for building and exploiting complex computing, communication, multimedia and measuring systems. Standards can simplify the design and construction of individual hardware and software components and help to ensure satisfactory interworking. Computer Standards & Interfaces is an international journal dealing specifically with these topics. The journal
• Provides information about activities and progress on the definition of computer standards, software quality, interfaces and methods, at national, European and international levels
• Publishes critical comments on standards and standards activities
• Disseminates users' experiences and case studies in the application and exploitation of established or emerging standards, interfaces and methods
• Offers a forum for discussion on actual projects, standards, interfaces and methods by recognised experts
• Stimulates relevant research by providing a specialised refereed medium.