Update Selective Parameters: Federated Machine Unlearning Based on Model Explanation

IF 7.5 | CAS Zone 3, Computer Science | JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Big Data | Pub Date: 2024-06-05 | DOI: 10.1109/TBDATA.2024.3409947

Heng Xu; Tianqing Zhu; Lefeng Zhang; Wanlei Zhou; Philip S. Yu

Published in vol. 11, no. 2, pp. 524-539. Full text: https://ieeexplore.ieee.org/document/10549794/

Citations: 0

Abstract

Federated learning is a promising privacy-preserving paradigm for distributed machine learning. In this context, there is sometimes a need for a specialized process called machine unlearning, which is required when the effect of some specific training samples needs to be removed from a learning model due to privacy, security, usability, and/or legislative factors. However, problems arise when current centralized unlearning methods are applied to existing federated learning, in which the server aims to remove all information about a class from the global model. Centralized unlearning usually focuses on simple models or is premised on the ability to access all training data at a central node. However, training data cannot be accessed on the server under the federated learning paradigm, conflicting with the requirements of the centralized unlearning process. Additionally, there are high computation and communication costs associated with accessing clients’ data, especially in scenarios involving numerous clients or complex global models. To address these concerns, we propose a more effective and efficient federated unlearning scheme based on the concept of model explanation. Model explanation involves understanding deep networks and individual channel importance, so that this understanding can be used to determine which model channels are critical for classes that need to be unlearned. We select the most influential channels within an already-trained model for the data that need to be unlearned and fine-tune only influential channels to remove the contribution made by those data. In this way, we can simultaneously avoid huge consumption costs and ensure that the unlearned model maintains good performance. Experiments with different training models on various datasets demonstrate the effectiveness of the proposed approach.
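
The selection-then-fine-tune idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how channel importance is computed or how the fine-tuning objective is defined, so everything below is an assumption made for illustration: the scoring rule (mean activation on the class to forget minus mean activation on other classes), the function names `channel_importance`, `select_top_channels`, and `masked_update`, and the toy numbers are all hypothetical and are not the authors' method.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def channel_importance(target_acts: Matrix, other_acts: Matrix) -> List[float]:
    """Score each channel for the class to forget.

    Assumed heuristic (not from the paper): a channel matters more for the
    target class when its mean activation on target-class samples exceeds
    its mean activation on samples from the other classes.
    Rows are samples, columns are channels.
    """
    n_channels = len(target_acts[0])

    def mean(acts: Matrix, c: int) -> float:
        return sum(row[c] for row in acts) / len(acts)

    return [mean(target_acts, c) - mean(other_acts, c) for c in range(n_channels)]


def select_top_channels(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring channels, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:k])


def masked_update(weights: Matrix, grads: Matrix,
                  selected: Sequence[int], lr: float) -> List[List[float]]:
    """Take one gradient step on the selected channels only.

    weights[c] and grads[c] hold the parameters and gradients of channel c.
    Channels outside `selected` are copied unchanged, i.e. frozen.
    """
    chosen = set(selected)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if c in chosen else list(w_row)
        for c, (w_row, g_row) in enumerate(zip(weights, grads))
    ]


# Toy data: 2 samples per group, 3 channels (hypothetical values).
target_acts = [[0.9, 0.1, 0.8], [0.7, 0.2, 0.6]]
other_acts = [[0.1, 0.2, 0.7], [0.3, 0.1, 0.5]]

scores = channel_importance(target_acts, other_acts)
selected = select_top_channels(scores, k=1)

# Placeholder weights and gradients; in practice the gradients would come
# from whatever unlearning objective is used.
weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
grads = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
new_weights = masked_update(weights, grads, selected, lr=0.1)

print("selected channels:", selected)
print("updated weights:", new_weights)
```

In this toy run only channel 0 is selected and updated, while channels 1 and 2 keep their original weights. Freezing the unselected channels is what limits the computation and is what the abstract credits for preserving performance on the remaining classes.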

Source journal

IEEE Transactions on Big Data Multiple-

CiteScore

11.80

Self-citation rate

2.80%

Articles published

114

Journal description: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.