{"title":"Explore the Transformation Space for Adversarial Images","authors":"Jiyu Chen, David Wang, Hao Chen","doi":"10.1145/3374664.3375728","DOIUrl":null,"url":null,"abstract":"Deep learning models are vulnerable to adversarial examples. Most of current adversarial attacks add pixel-wise perturbations restricted to some \\(L^p\\)-norm, and defense models are evaluated also on adversarial examples restricted inside \\(L^p\\)-norm balls. However, we wish to explore adversarial examples exist beyond \\(L^p\\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since \\(L^p\\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \\(L^p \\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \\(L^p \\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.","PeriodicalId":171521,"journal":{"name":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","volume":"912 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3374664.3375728","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10
Abstract
Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations restricted to some \(L^p\)-norm, and defense models are likewise evaluated on adversarial examples restricted to \(L^p\)-norm balls. However, we wish to explore adversarial examples that exist beyond \(L^p\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformations and propose two gradient-based attacks. Since the \(L^p\)-norm is inappropriate for measuring image quality in the transformation space, we instead use the similarity between transformations and the Structural Similarity Index (SSIM). Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \(L^p\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \(L^p\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
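To make the idea of a gradient-based transformation attack concrete, the following is a minimal illustrative sketch, not the authors' actual method. It assumes a hypothetical per-channel affine color transform (scale and shift), optimizes the transform parameters with PyTorch so that a given classifier misclassifies the image, and reports SSIM between the original and transformed images as a quality measure instead of an \(L^p\) distance. The function name, parameterization, and hyperparameters are placeholders chosen for illustration.

```python
# Illustrative sketch of a gradient-based color-transformation attack.
# NOT the paper's exact algorithm: the transform here is a simple per-channel
# affine map, whereas the paper explores richer color and affine transformations.

import torch
import torch.nn.functional as F
from skimage.metrics import structural_similarity  # requires scikit-image >= 0.19


def color_transform_attack(model, image, label, steps=100, lr=0.01):
    """image: (1, 3, H, W) float tensor in [0, 1]; label: (1,) long tensor."""
    # Transformation parameters: per-channel scale (init 1) and shift (init 0).
    scale = torch.ones(1, 3, 1, 1, requires_grad=True)
    shift = torch.zeros(1, 3, 1, 1, requires_grad=True)
    optimizer = torch.optim.Adam([scale, shift], lr=lr)

    for _ in range(steps):
        # Apply the color transform and keep pixel values in a valid range.
        adv = torch.clamp(image * scale + shift, 0.0, 1.0)
        logits = model(adv)
        # Untargeted attack: maximize the loss of the true label.
        loss = -F.cross_entropy(logits, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    adv = torch.clamp(image * scale + shift, 0.0, 1.0).detach()
    # Quality in the transformation space: SSIM rather than an L^p norm.
    ssim = structural_similarity(
        image.squeeze(0).permute(1, 2, 0).numpy(),
        adv.squeeze(0).permute(1, 2, 0).numpy(),
        channel_axis=-1,
        data_range=1.0,
    )
    return adv, ssim
```

The sketch only shows the general recipe suggested by the abstract: parameterize an image transformation, follow the gradient of the classification loss with respect to the transformation parameters, and judge the result by transformation similarity and SSIM rather than a pixel-wise norm. The paper's actual attacks operate over a larger space that combines color and affine transformations.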