Explore the Transformation Space for Adversarial Images

Jiyu Chen, David Wang, Hao Chen
{"title":"Explore the Transformation Space for Adversarial Images","authors":"Jiyu Chen, David Wang, Hao Chen","doi":"10.1145/3374664.3375728","DOIUrl":null,"url":null,"abstract":"Deep learning models are vulnerable to adversarial examples. Most of current adversarial attacks add pixel-wise perturbations restricted to some \\(L^p\\)-norm, and defense models are evaluated also on adversarial examples restricted inside \\(L^p\\)-norm balls. However, we wish to explore adversarial examples exist beyond \\(L^p\\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since \\(L^p\\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \\(L^p \\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \\(L^p \\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.","PeriodicalId":171521,"journal":{"name":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","volume":"912 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3374664.3375728","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations restricted to some \(L^p\)-norm, and defense models are also evaluated on adversarial examples restricted inside \(L^p\)-norm balls. However, we wish to explore adversarial examples that exist beyond \(L^p\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformations and propose two gradient-based attacks. Since the \(L^p\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index (SSIM) instead. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \(L^p\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \(L^p\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
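To make the general idea concrete, the following is a minimal sketch (not the authors' implementation) of a gradient-based attack in the color-transformation space. It assumes a PyTorch classifier model, an image batch x with values in [0, 1], and true labels y; the per-channel linear color transform, the Adam optimizer, and the regularization toward the identity transform are illustrative stand-ins for the paper's two attacks and its transformation-similarity measure.

import torch
import torch.nn.functional as F

def color_transform_attack(model, x, y, steps=100, lr=0.05, reg=1.0):
    """Illustrative search of the color-transformation space for adversarial images.

    x: input batch of shape (N, 3, H, W) with values in [0, 1]
    y: true labels of shape (N,)
    reg: weight pulling the transformation back toward the identity
    """
    # Per-channel scale (initialized to the identity) and bias (initialized to 0)
    # are the only free parameters, unlike per-pixel L^p perturbations.
    scale = torch.ones(x.size(0), 3, 1, 1, requires_grad=True)
    bias = torch.zeros(x.size(0), 3, 1, 1, requires_grad=True)
    opt = torch.optim.Adam([scale, bias], lr=lr)

    for _ in range(steps):
        x_adv = torch.clamp(scale * x + bias, 0.0, 1.0)
        logits = model(x_adv)
        # Maximize the classification loss while keeping the transformation
        # close to the identity (a stand-in for the paper's similarity metric).
        cls_loss = F.cross_entropy(logits, y)
        reg_loss = ((scale - 1.0) ** 2).mean() + (bias ** 2).mean()
        loss = -cls_loss + reg * reg_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return torch.clamp(scale * x + bias, 0.0, 1.0).detach()

Optimizing a handful of transformation parameters rather than a per-pixel perturbation is what distinguishes this attack space from \(L^p\)-bounded attacks; the resulting images could additionally be screened with SSIM (for example via skimage.metrics.structural_similarity) to discard visually degraded candidates, in the spirit of the quality measures used in the paper.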