Attack as defense: characterizing adversarial examples using robustness

Zhe Zhao, Guangke Chen, Jingyi Wang, Yiwei Yang, Fu Song, Jun Sun
{"title":"攻击即防御:使用鲁棒性描述对抗性示例","authors":"Zhe Zhao, Guangke Chen, Jingyi Wang, Yiwei Yang, Fu Song, Jun Sun","doi":"10.1145/3460319.3464822","DOIUrl":null,"url":null,"abstract":"As a new programming paradigm, deep learning has expanded its application to many real-world problems. At the same time, deep learning based software are found to be vulnerable to adversarial attacks. Though various defense mechanisms have been proposed to improve robustness of deep learning software, many of them are ineffective against adaptive attacks. In this work, we propose a novel characterization to distinguish adversarial examples from benign ones based on the observation that adversarial examples are significantly less robust than benign ones. As existing robustness measurement does not scale to large networks, we propose a novel defense framework, named attack as defense (A2D), to detect adversarial examples by effectively evaluating an example’s robustness. A2D uses the cost of attacking an input for robustness evaluation and identifies those less robust examples as adversarial since less robust examples are easier to attack. Extensive experiment results on MNIST, CIFAR10 and ImageNet show that A2D is more effective than recent promising approaches. We also evaluate our defense against potential adaptive attacks and show that A2D is effective in defending carefully designed adaptive attacks, e.g., the attack success rate drops to 0% on CIFAR10.","PeriodicalId":188008,"journal":{"name":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Attack as defense: characterizing adversarial examples using robustness\",\"authors\":\"Zhe Zhao, Guangke Chen, Jingyi Wang, Yiwei Yang, Fu Song, Jun Sun\",\"doi\":\"10.1145/3460319.3464822\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a new programming paradigm, deep learning has expanded its application to many real-world problems. At the same time, deep learning based software are found to be vulnerable to adversarial attacks. Though various defense mechanisms have been proposed to improve robustness of deep learning software, many of them are ineffective against adaptive attacks. In this work, we propose a novel characterization to distinguish adversarial examples from benign ones based on the observation that adversarial examples are significantly less robust than benign ones. As existing robustness measurement does not scale to large networks, we propose a novel defense framework, named attack as defense (A2D), to detect adversarial examples by effectively evaluating an example’s robustness. A2D uses the cost of attacking an input for robustness evaluation and identifies those less robust examples as adversarial since less robust examples are easier to attack. Extensive experiment results on MNIST, CIFAR10 and ImageNet show that A2D is more effective than recent promising approaches. 
We also evaluate our defense against potential adaptive attacks and show that A2D is effective in defending carefully designed adaptive attacks, e.g., the attack success rate drops to 0% on CIFAR10.\",\"PeriodicalId\":188008,\"journal\":{\"name\":\"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3460319.3464822\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3460319.3464822","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 24

Abstract

As a new programming paradigm, deep learning has expanded its application to many real-world problems. At the same time, deep-learning-based software has been found to be vulnerable to adversarial attacks. Although various defense mechanisms have been proposed to improve the robustness of deep learning software, many of them are ineffective against adaptive attacks. In this work, we propose a novel characterization that distinguishes adversarial examples from benign ones, based on the observation that adversarial examples are significantly less robust than benign ones. As existing robustness measurements do not scale to large networks, we propose a novel defense framework, named attack as defense (A2D), that detects adversarial examples by efficiently evaluating an example's robustness. A2D uses the cost of attacking an input to evaluate its robustness and flags less robust examples as adversarial, since they are easier to attack. Extensive experimental results on MNIST, CIFAR10, and ImageNet show that A2D is more effective than recent promising approaches. We also evaluate our defense against potential adaptive attacks and show that A2D is effective in defending against carefully designed adaptive attacks; for example, the attack success rate drops to 0% on CIFAR10.
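
The abstract describes the mechanism but gives no code. The PyTorch sketch below illustrates one way the attack-as-defense idea could be instantiated: probe each input with an untargeted PGD attack, count the steps needed to flip the model's prediction, and flag inputs that are cheap to attack as adversarial. This is a hypothetical illustration, not the authors' implementation; the choice of PGD as the probe attack, the step budget, the function names attack_cost and detect, and the threshold value are all assumptions.

# Minimal sketch of attack-cost-based detection (assumptions noted above).
# Assumes `model` is a classifier in eval mode and `x` is a single image
# tensor of shape [1, C, H, W] with pixel values in [0, 1].
import torch
import torch.nn.functional as F

def attack_cost(model, x, label, eps=0.03, step_size=0.007, max_steps=50):
    """Return the number of untargeted PGD steps needed to flip the
    model's prediction on x. A low cost means x is easy to attack,
    i.e., less robust."""
    x_adv = x.clone().detach()
    for step in range(1, max_steps + 1):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), label)
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()   # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # stay a valid image
            if model(x_adv).argmax(dim=1).item() != label.item():
                return step                           # prediction flipped
    return max_steps + 1  # never flipped within budget: treat as robust

def detect(model, x, threshold=5):
    """Flag x as adversarial when its attack cost is at or below a
    threshold. The value 5 is a placeholder; in practice the threshold
    would be calibrated on benign examples."""
    label = model(x).argmax(dim=1)
    return attack_cost(model, x, label) <= threshold

Counting attack iterations is only one possible cost measure; the framework as described would admit other proxies for how hard an input is to attack, with the detection threshold calibrated on benign data.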