{"title":"NPAT Null-Space Projected Adversarial Training Towards Zero Deterioration","authors":"Hanyi Hu, Qiao Han, Kui Chen, Yao Yang","doi":"arxiv-2409.11754","DOIUrl":null,"url":null,"abstract":"To mitigate the susceptibility of neural networks to adversarial attacks,\nadversarial training has emerged as a prevalent and effective defense strategy.\nIntrinsically, this countermeasure incurs a trade-off, as it sacrifices the\nmodel's accuracy in processing normal samples. To reconcile the trade-off, we\npioneer the incorporation of null-space projection into adversarial training\nand propose two innovative Null-space Projection based Adversarial\nTraining(NPAT) algorithms tackling sample generation and gradient optimization,\nnamed Null-space Projected Data Augmentation (NPDA) and Null-space Projected\nGradient Descent (NPGD), to search for an overarching optimal solutions, which\nenhance robustness with almost zero deterioration in generalization\nperformance. Adversarial samples and perturbations are constrained within the\nnull-space of the decision boundary utilizing a closed-form null-space\nprojector, effectively mitigating threat of attack stemming from unreliable\nfeatures. Subsequently, we conducted experiments on the CIFAR10 and SVHN\ndatasets and reveal that our methodology can seamlessly combine with\nadversarial training methods and obtain comparable robustness while keeping\ngeneralization close to a high-accuracy model.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11754","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
To mitigate the susceptibility of neural networks to adversarial attacks, adversarial training has emerged as a prevalent and effective defense strategy. Intrinsically, this countermeasure incurs a trade-off, as it sacrifices the model's accuracy on normal samples. To reconcile the trade-off, we pioneer the incorporation of null-space projection into adversarial training and propose two Null-space Projection based Adversarial Training (NPAT) algorithms, addressing sample generation and gradient optimization respectively: Null-space Projected Data Augmentation (NPDA) and Null-space Projected Gradient Descent (NPGD). Both search for an overarching optimal solution that enhances robustness with almost zero deterioration in generalization performance. Adversarial samples and perturbations are constrained within the null-space of the decision boundary using a closed-form null-space projector, effectively mitigating the threat of attacks stemming from unreliable features. We conduct experiments on the CIFAR-10 and SVHN datasets and show that our methodology can be seamlessly combined with existing adversarial training methods, obtaining comparable robustness while keeping generalization close to that of a high-accuracy model.
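
The closed-form null-space projector mentioned in the abstract is a standard linear-algebra construction. The sketch below is only an illustration of that construction, not the paper's actual implementation: the matrix W (standing in for some linearization of the decision boundary), the function name null_space_projector, and the toy dimensions are all assumptions made for the example. It shows how a perturbation can be constrained to the null space of W via the projector P = I - W^+ W.

```python
import numpy as np

def null_space_projector(W: np.ndarray) -> np.ndarray:
    """Closed-form projector onto the null space of W.

    P = I - W^+ W, so W @ (P @ x) == 0 (up to numerical error) for any x.
    """
    d = W.shape[1]
    return np.eye(d) - np.linalg.pinv(W) @ W

# Toy illustration (all quantities hypothetical): project a random
# perturbation so that it no longer moves the input along the
# directions captured by the linearized decision boundary W.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 10))       # 3 boundary-gradient rows in R^10
delta = rng.standard_normal(10)        # candidate adversarial perturbation

P = null_space_projector(W)
delta_null = P @ delta                 # perturbation constrained to null(W)

print(np.linalg.norm(W @ delta))       # generally non-zero before projection
print(np.linalg.norm(W @ delta_null))  # ~0 after projection
```

In this reading, a perturbation projected into null(W) leaves the modeled boundary directions untouched, which is one plausible way to interpret the abstract's claim that such constraints suppress attacks that exploit unreliable features; the paper itself should be consulted for the exact choice of W and how the projector enters NPDA and NPGD.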