{"title":"探索对抗性图像的转换空间","authors":"Jiyu Chen, David Wang, Hao Chen","doi":"10.1145/3374664.3375728","DOIUrl":null,"url":null,"abstract":"Deep learning models are vulnerable to adversarial examples. Most of current adversarial attacks add pixel-wise perturbations restricted to some \\(L^p\\)-norm, and defense models are evaluated also on adversarial examples restricted inside \\(L^p\\)-norm balls. However, we wish to explore adversarial examples exist beyond \\(L^p\\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since \\(L^p\\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \\(L^p \\) attacks, especially at high confidence levels. They are also significantly harder to defend against by retraining than C&W's \\(L^p \\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.","PeriodicalId":171521,"journal":{"name":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","volume":"912 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Explore the Transformation Space for Adversarial Images\",\"authors\":\"Jiyu Chen, David Wang, Hao Chen\",\"doi\":\"10.1145/3374664.3375728\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning models are vulnerable to adversarial examples. Most of current adversarial attacks add pixel-wise perturbations restricted to some \\\\(L^p\\\\)-norm, and defense models are evaluated also on adversarial examples restricted inside \\\\(L^p\\\\)-norm balls. However, we wish to explore adversarial examples exist beyond \\\\(L^p\\\\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformation and propose two gradient-based attacks. Since \\\\(L^p\\\\)-norm is inappropriate for measuring image quality in the transformation space, we use the similarity between transformations and the Structural Similarity Index. Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful. They find high-quality adversarial images that have higher transferability and misclassification rates than C&W's \\\\(L^p \\\\) attacks, especially at high confidence levels. 
They are also significantly harder to defend against by retraining than C&W's \\\\(L^p \\\\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.\",\"PeriodicalId\":171521,\"journal\":{\"name\":\"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy\",\"volume\":\"912 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-03-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3374664.3375728\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3374664.3375728","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Explore the Transformation Space for Adversarial Images
Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations restricted to some \(L^p\)-norm, and defense models are likewise evaluated on adversarial examples restricted to \(L^p\)-norm balls. However, we wish to explore adversarial examples that exist beyond \(L^p\)-norm balls and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformations and propose two gradient-based attacks. Since the \(L^p\)-norm is inappropriate for measuring image quality in the transformation space, we instead use the similarity between transformations and the Structural Similarity Index (SSIM). Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets --- CIFAR10, SVHN, and ImageNet --- and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful: they find high-quality adversarial images with higher transferability and misclassification rates than C&W's \(L^p\) attacks, especially at high confidence levels, and they are significantly harder to defend against by retraining than C&W's \(L^p\) attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
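
To make the idea of attacking in the transformation space concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it optimizes per-channel color scale and shift parameters by gradient ascent on the classification loss, leaving individual pixels untouched, and reports a simplified single-window SSIM between the original and the transformed image. The tiny model, the random input, and the specific color parameterization are illustrative assumptions; the paper's attacks use richer color and affine transformations, a transformation-similarity measure, and the standard windowed SSIM.

```python
# Sketch of a gradient-based color-transformation attack (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


def color_transform(x, scale, shift):
    """Per-channel linear color transformation of a batch of images.

    x: (N, 3, H, W) in [0, 1]; scale, shift: (3,) learnable parameters.
    """
    y = x * scale.view(1, 3, 1, 1) + shift.view(1, 3, 1, 1)
    return y.clamp(0.0, 1.0)


def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified single-window SSIM over the whole image (the paper uses the
    standard windowed SSIM); inputs are assumed to lie in [0, 1]."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(unbiased=False), y.var(unbiased=False)
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


def color_attack(model, x, label, steps=100, lr=0.05):
    """Untargeted attack: maximize the classification loss with respect to the
    six color parameters only, rather than perturbing pixels directly."""
    scale = torch.ones(3, requires_grad=True)
    shift = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([scale, shift], lr=lr)
    for _ in range(steps):
        adv = color_transform(x, scale, shift)
        loss = F.cross_entropy(model(adv), label)
        opt.zero_grad()
        (-loss).backward()  # minimize -loss, i.e., gradient ascent on the loss
        opt.step()
    return color_transform(x, scale, shift).detach()


if __name__ == "__main__":
    # Placeholder CIFAR10-sized classifier and a random image; in practice a
    # trained model and real images would be used here.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
    )
    x = torch.rand(1, 3, 32, 32)
    label = torch.tensor([3])
    adv = color_attack(model, x, label)
    print("predicted:", model(adv).argmax(dim=1).item(),
          "SSIM vs. original:", global_ssim(x, adv).item())
```

Because the optimization variables are a handful of color parameters rather than individual pixels, the result is a globally consistent recoloring of the original image; its \(L^p\) distance from the original can be large even when the image looks natural, which is why transformation similarity and SSIM, rather than an \(L^p\) budget, are used to judge quality in this setting.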