Jujutsu: A Two-stage Defense against Adversarial Patch Attacks on Deep Neural Networks

Zitao Chen, Pritam Dash, K. Pattabiraman
{"title":"Jujutsu: A Two-stage Defense against Adversarial Patch Attacks on Deep Neural Networks","authors":"Zitao Chen, Pritam Dash, K. Pattabiraman","doi":"10.1145/3579856.3582816","DOIUrl":null,"url":null,"abstract":"Adversarial patch attacks create adversarial examples by injecting arbitrary distortions within a bounded region of the input to fool deep neural networks (DNNs). These attacks are robust (i.e., physically-realizable) and universally malicious, and hence represent a severe security threat to real-world DNN-based systems. We propose Jujutsu, a two-stage technique to detect and mitigate robust and universal adversarial patch attacks. We first observe that adversarial patches are crafted as localized features that yield large influence on the prediction output, and continue to dominate the prediction on any input. Jujutsu leverages this observation for accurate attack detection with low false positives. Patch attacks corrupt only a localized region of the input, while the majority of the input remains unperturbed. Therefore, Jujutsu leverages generative adversarial networks (GAN) to perform localized attack recovery by synthesizing the semantic contents of the input that are corrupted by the attacks, and reconstructs a “clean” input for correct prediction. We evaluate Jujutsu on four diverse datasets spanning 8 different DNN models, and find that it achieves superior performance and significantly outperforms four existing defenses. We further evaluate Jujutsu against physical-world attacks, as well as adaptive attacks.","PeriodicalId":156082,"journal":{"name":"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security","volume":"242 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3579856.3582816","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Adversarial patch attacks create adversarial examples by injecting arbitrary distortions within a bounded region of the input to fool deep neural networks (DNNs). These attacks are robust (i.e., physically-realizable) and universally malicious, and hence represent a severe security threat to real-world DNN-based systems. We propose Jujutsu, a two-stage technique to detect and mitigate robust and universal adversarial patch attacks. We first observe that adversarial patches are crafted as localized features that yield large influence on the prediction output, and continue to dominate the prediction on any input. Jujutsu leverages this observation for accurate attack detection with low false positives. Patch attacks corrupt only a localized region of the input, while the majority of the input remains unperturbed. Therefore, Jujutsu leverages generative adversarial networks (GAN) to perform localized attack recovery by synthesizing the semantic contents of the input that are corrupted by the attacks, and reconstructs a “clean” input for correct prediction. We evaluate Jujutsu on four diverse datasets spanning 8 different DNN models, and find that it achieves superior performance and significantly outperforms four existing defenses. We further evaluate Jujutsu against physical-world attacks, as well as adaptive attacks.
柔术:深度神经网络对抗性补丁攻击的两阶段防御
对抗性补丁攻击通过在输入的有界区域内注入任意扭曲来创建对抗性示例,以欺骗深度神经网络(dnn)。这些攻击是强大的(即,物理上可实现的)和普遍恶意的,因此对现实世界中基于dnn的系统构成了严重的安全威胁。我们提出了一种两阶段技术,用于检测和减轻鲁棒和通用的对抗性补丁攻击。我们首先观察到,对抗性补丁被制作为局部特征,对预测输出产生很大的影响,并在任何输入上继续主导预测。柔术利用这种观察进行准确的攻击检测,误报率低。补丁攻击只破坏输入的局部区域,而大部分输入保持不受干扰。因此,Jujutsu利用生成式对抗网络(GAN)通过合成被攻击破坏的输入的语义内容来执行局部攻击恢复,并重建一个“干净”的输入以进行正确的预测。我们在跨越8种不同DNN模型的4个不同数据集上对Jujutsu进行了评估,发现它达到了卓越的性能,并且显著优于4种现有的防御。我们进一步评估柔术对抗物理世界的攻击,以及自适应攻击。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信