{"title":"Explore the Transformation Space for Adversarial Images","authors":"Jiyu Chen, David Wang, Hao Chen","doi":"10.1145/3374664.3375728","DOIUrl":null,"url":null,"abstract":"Deep learning models are vulnerable to adversarial examples. Most of current adversarial attacks add pixel-wise perturbations restricted to some \\(L^p\\)-norm, and defense models are evaluated also on adversarial examples restricted inside \\(L^p\\)-norm balls. However, we wish to explore adversarial examples exist beyond \\(L^p\\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since \\(L^p\\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \\(L^p \\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \\(L^p \\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.","PeriodicalId":171521,"journal":{"name":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","volume":"912 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3374664.3375728","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10
Abstract
Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations restricted to some \(L^p\)-norm, and defense models are likewise evaluated on adversarial examples restricted to \(L^p\)-norm balls. However, we wish to explore adversarial examples that exist beyond \(L^p\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformations and propose two gradient-based attacks. Since the \(L^p\)-norm is inappropriate for measuring image quality in the transformation space, we instead use the similarity between transformations and the Structural Similarity Index (SSIM). Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \(L^p\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \(L^p\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
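To make the idea of a gradient-based transformation attack concrete, the following is a minimal illustrative sketch, not the authors' actual method. It assumes a hypothetical per-channel affine color transform (scale and shift), optimizes the transform parameters with PyTorch so that a given classifier misclassifies the image, and reports SSIM between the original and transformed images as a quality measure instead of an \(L^p\) distance. The function name, parameterization, and hyperparameters are placeholders chosen for illustration.

```python
# Illustrative sketch of a gradient-based color-transformation attack.
# NOT the paper's exact algorithm: the transform here is a simple per-channel
# affine map, whereas the paper explores richer color and affine transformations.

import torch
import torch.nn.functional as F
from skimage.metrics import structural_similarity  # requires scikit-image >= 0.19


def color_transform_attack(model, image, label, steps=100, lr=0.01):
    """image: (1, 3, H, W) float tensor in [0, 1]; label: (1,) long tensor."""
    # Transformation parameters: per-channel scale (init 1) and shift (init 0).
    scale = torch.ones(1, 3, 1, 1, requires_grad=True)
    shift = torch.zeros(1, 3, 1, 1, requires_grad=True)
    optimizer = torch.optim.Adam([scale, shift], lr=lr)

    for _ in range(steps):
        # Apply the color transform and keep pixel values in a valid range.
        adv = torch.clamp(image * scale + shift, 0.0, 1.0)
        logits = model(adv)
        # Untargeted attack: maximize the loss of the true label.
        loss = -F.cross_entropy(logits, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    adv = torch.clamp(image * scale + shift, 0.0, 1.0).detach()
    # Quality in the transformation space: SSIM rather than an L^p norm.
    ssim = structural_similarity(
        image.squeeze(0).permute(1, 2, 0).numpy(),
        adv.squeeze(0).permute(1, 2, 0).numpy(),
        channel_axis=-1,
        data_range=1.0,
    )
    return adv, ssim
```

The sketch only shows the general recipe suggested by the abstract: parameterize an image transformation, follow the gradient of the classification loss with respect to the transformation parameters, and judge the result by transformation similarity and SSIM rather than a pixel-wise norm. The paper's actual attacks operate over a larger space that combines color and affine transformations.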