PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning

Yukai Xu, Yujie Gu, Kouichi Sakurai
{"title":"PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning","authors":"Yukai Xu, Yujie Gu, Kouichi Sakurai","doi":"arxiv-2409.12072","DOIUrl":null,"url":null,"abstract":"Backdoor attacks pose a significant threat to deep neural networks,\nparticularly as recent advancements have led to increasingly subtle\nimplantation, making the defense more challenging. Existing defense mechanisms\ntypically rely on an additional clean dataset as a standard reference and\ninvolve retraining an auxiliary model or fine-tuning the entire victim model.\nHowever, these approaches are often computationally expensive and not always\nfeasible in practical applications. In this paper, we propose a novel and\nlightweight defense mechanism, termed PAD-FT, that does not require an\nadditional clean dataset and fine-tunes only a very small part of the model to\ndisinfect the victim model. To achieve this, our approach first introduces a\nsimple data purification process to identify and select the most-likely clean\ndata from the poisoned training dataset. The self-purified clean dataset is\nthen used for activation clipping and fine-tuning only the last classification\nlayer of the victim model. By integrating data purification, activation\nclipping, and classifier fine-tuning, our mechanism PAD-FT demonstrates\nsuperior effectiveness across multiple backdoor attack methods and datasets, as\nconfirmed through extensive experimental evaluation.","PeriodicalId":501332,"journal":{"name":"arXiv - CS - Cryptography and Security","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Cryptography and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.12072","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Backdoor attacks pose a significant threat to deep neural networks, particularly as recent advances have made trigger implantation increasingly subtle, making defense more challenging. Existing defense mechanisms typically rely on an additional clean dataset as a reference and involve retraining an auxiliary model or fine-tuning the entire victim model. However, these approaches are often computationally expensive and not always feasible in practice. In this paper, we propose a novel, lightweight defense mechanism, termed PAD-FT, that requires no additional clean dataset and fine-tunes only a very small part of the model to disinfect it. To achieve this, our approach first introduces a simple data purification process that identifies and selects the most likely clean samples from the poisoned training dataset. The self-purified clean dataset is then used for activation clipping and for fine-tuning only the last classification layer of the victim model. By integrating data purification, activation clipping, and classifier fine-tuning, PAD-FT demonstrates superior effectiveness across multiple backdoor attack methods and datasets, as confirmed through extensive experimental evaluation.
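To make the pipeline concrete, below is a minimal PyTorch sketch of the three stages the abstract describes. It is an illustration under stated assumptions, not the paper's implementation: the purification criterion (ranking samples by per-sample cross-entropy loss and keeping the lowest-loss fraction), the clipping rule (a per-feature quantile bound estimated on the purified data), and all names and hyperparameters (`purify`, `install_activation_clip`, `feature_layer`, `keep_ratio`, `q`) are hypothetical stand-ins for whatever PAD-FT actually specifies.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset


@torch.no_grad()
def purify(model, dataset, keep_ratio=0.2, device="cpu"):
    """Step 1 (purification): keep the lowest-loss fraction of the poisoned
    training set as the self-purified, most-likely-clean subset.
    (Loss ranking is an assumed criterion, not necessarily the paper's.)"""
    model.eval().to(device)
    losses = []
    for x, y in DataLoader(dataset, batch_size=256):  # no shuffle: preserve index order
        loss = F.cross_entropy(model(x.to(device)), y.to(device), reduction="none")
        losses.append(loss.cpu())
    losses = torch.cat(losses)
    keep = torch.argsort(losses)[: int(keep_ratio * len(dataset))]
    return Subset(dataset, keep.tolist())


@torch.no_grad()
def install_activation_clip(model, feature_layer, clean_loader, q=0.99, device="cpu"):
    """Step 2 (activation clipping): estimate a per-feature upper bound from
    activations on the purified data, then clamp the layer's output to it.
    Assumes `feature_layer` emits a 2-D (batch, features) tensor."""
    acts = []
    probe = feature_layer.register_forward_hook(
        lambda m, i, o: acts.append(o.detach().cpu()))
    for x, _ in clean_loader:
        model(x.to(device))
    probe.remove()
    bound = torch.quantile(torch.cat(acts), q, dim=0).to(device)
    # Returning a value from a forward hook replaces the layer's output.
    return feature_layer.register_forward_hook(lambda m, i, o: torch.minimum(o, bound))


def finetune_classifier(model, classifier, clean_loader, epochs=5, lr=1e-3, device="cpu"):
    """Step 3 (fine-tuning): freeze the backbone and train only the last
    classification layer on the purified data, with clipping active."""
    model.eval()  # keep batch-norm statistics frozen in the backbone
    for p in model.parameters():
        p.requires_grad_(False)
    for p in classifier.parameters():
        p.requires_grad_(True)
    opt = torch.optim.SGD(classifier.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in clean_loader:
            opt.zero_grad()
            F.cross_entropy(model(x.to(device)), y.to(device)).backward()
            opt.step()
    return model
```

A typical call sequence, assuming a torchvision-style network whose penultimate feature output feeds a final linear layer `model.fc` (both names hypothetical):

```python
clean_subset = purify(model, poisoned_train_set)
loader = DataLoader(clean_subset, batch_size=128, shuffle=True)
install_activation_clip(model, feature_layer, loader)
finetune_classifier(model, model.fc, loader)
```

Fine-tuning only the classifier keeps the cost low, which is the point of the method: the frozen backbone contributes no gradient computation for its own weights, and the clipping hook runs at inference cost.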