Adversarial purification using random encoding networks

Impact Factor: 7.2 | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | CAS Tier 1, Computer Science
Yuxin Gong, Shen Wang, Xunzhi Jiang, Tingyue Yu, Fanghui Sun
Journal: Applied Soft Computing, Volume 183, Article 113604
DOI: 10.1016/j.asoc.2025.113604
Published: 2025-07-24 (Journal Article)
Cited by: 0

Abstract


Deep neural networks (DNNs) have revealed vulnerabilities to adversarial examples, which can deceive models with high confidence. This has given rise to serious threats in security-critical domains. Adversarial defense methods have been extensively studied to counter adversarial attacks. Adversarial purification, as a major defense strategy, attempts to recover adversarial examples to clean counterparts by filtering out perturbations. However, many purification defenses struggle against white-box attacks where the target and defense models are known. Additionally, the training processes against specific attacks can compromise models’ adaptability to unknown attacks, and purification operations may destroy key features of inputs. In this paper, we propose the random encoding network (REN), which consists of a random encoding denoiser and a diverse classifier to enhance the robustness of adversarial purification defense models. The internal part of the denoiser leverages adversarial sparse coding to purify examples by filtering out perturbations and noise as much as possible while preserving critical features of inputs. The external part of the denoiser employs a dynamic random mechanism to implement random encoding, thereby enhancing the models’ uncertainty. Moreover, the classifier is subjected to a diversity constraint to promote variation among random sub-models. Experimental results demonstrate that REN exhibits strong defensive generalization capabilities, effectively countering adversarial examples across diverse attack types and settings. For the CIFAR-10 and SVHN datasets, the clean-trained REN achieves average adversarial accuracies of 63.26% and 59.78% against white-box attacks, while the adversarial-trained REN achieves 68.27% and 72.39%, respectively. When faced with unknown attack scenarios, REN is more effective than state-of-the-art defense methods.
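The abstract describes a three-stage pipeline: purify the input with sparse coding, apply a dynamic random encoding, then classify with an ensemble of diverse sub-models. The paper's exact formulation is not reproduced here; the NumPy sketch below is purely illustrative, with soft-thresholding standing in for adversarial sparse coding, a fresh dropout-style mask standing in for the dynamic random mechanism, and independently initialized linear sub-models standing in for the diversity-constrained classifier. All function and class names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_denoise(x, threshold=0.1):
    """Soft-thresholding as a stand-in for sparse-coding purification:
    small coefficients (treated as perturbation/noise) are suppressed,
    while large coefficients (critical features) are mostly preserved."""
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)

def random_encode(x, keep_prob=0.8, rng=rng):
    """Dynamic random mechanism: a fresh random mask on every call, so a
    white-box attacker cannot differentiate through one fixed transform."""
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

class DiverseEnsemble:
    """Toy linear sub-models; here diversity comes only from independent
    random initialization (the paper adds an explicit diversity constraint)."""
    def __init__(self, n_models, dim, n_classes, rng=rng):
        self.weights = [rng.normal(size=(dim, n_classes)) for _ in range(n_models)]

    def predict(self, x):
        votes = [int(np.argmax(x @ w)) for w in self.weights]
        return max(set(votes), key=votes.count)  # majority vote

def ren_predict(x, ensemble):
    # Purify, then randomly encode, then classify with the diverse ensemble.
    purified = sparse_denoise(x)
    encoded = random_encode(purified)
    return ensemble.predict(encoded)
```

Because the encoding mask is resampled per call, repeated queries on the same input can traverse different sub-model decisions, which is the source of the uncertainty the abstract attributes to the external part of the denoiser.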
Source journal
Applied Soft Computing (Engineering & Technology - Computer Science: Interdisciplinary Applications)
CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review turnaround: 10.9 months
Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site is continuously updated with new articles and the publication time is short.