Improving the Transferability of Adversarial Examples via Direction Tuning

Inf. Sci. | Pub Date: 2023-03-27 | Pages: 119491 | DOI: 10.48550/arXiv.2303.15109

Xiangyuan Yang, Jie Lin, Han Zhang, Xinyu Yang, Peng Zhao

Citations: 1

Abstract

In transfer-based adversarial attacks, adversarial examples are generated using only surrogate models and are expected to remain effective against victim models. Although considerable effort has gone into improving the transferability of such examples, our investigation found that the large update step length used by current transfer-based attacks causes a large deviation between the actual update direction and the steepest update direction, so the generated adversarial examples do not converge well. Directly reducing the update step length, however, leads to serious update oscillation, so the generated examples again fail to transfer well to victim models. To address these issues, we propose a novel transfer-based attack, the direction tuning attack, which both decreases the update deviation under a large step length and mitigates the update oscillation under a small sampling step length, so that the generated adversarial examples converge well and transfer well to victim models. In addition, we propose a network pruning method that smooths the decision boundary, further decreasing update oscillation and enhancing the transferability of the generated adversarial examples. Experimental results on ImageNet show that, compared with the latest gradient-based attacks, our method improves the average attack success rate (ASR) from 87.9% to 94.5% on five victim models without defenses, and from 69.1% to 76.2% on eight advanced defense methods.
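The step-length trade-off described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the abstract gives no update rule, so the code assumes the standard iterative sign-gradient update under an L-infinity budget, and the `inner` look-ahead (averaging gradients sampled at small steps before taking one large step) is only one hypothetical reading of "small sampling step length". The function name, its parameters, and the toy quadratic loss are all illustrative assumptions.

```python
import numpy as np


def sign_gradient_attack(x0, grad_fn, eps, steps, inner=1):
    """Iterative L-infinity sign-gradient ascent on a surrogate loss.

    inner=1 is the plain iterative update with step length eps/steps.
    inner>1 is a hypothetical look-ahead (an assumption, not the paper's
    method): gradients are sampled at `inner` points spaced by a smaller
    step, and their mean sets the direction of the large outer step.
    """
    x0 = np.asarray(x0, dtype=float)
    alpha = eps / steps  # outer update step length
    x = x0.copy()
    for _ in range(steps):
        probe = x.copy()
        grads = []
        for _ in range(inner):
            g = grad_fn(probe)
            grads.append(g)
            # small sampling step along the current gradient sign
            probe = probe + (alpha / inner) * np.sign(g)
        direction = np.mean(grads, axis=0)
        x = x + alpha * np.sign(direction)
        # project back into the eps-ball around the clean input
        x = np.clip(x, x0 - eps, x0 + eps)
    return x


# Toy surrogate loss (illustrative only): L(x) = 0.5 * sum(A * x^2)
A = np.array([1.0, 4.0])


def loss(x):
    return 0.5 * np.sum(A * x * x)


def grad(x):
    return A * x


x0 = np.array([0.3, -0.2])
x_adv = sign_gradient_attack(x0, grad, eps=0.1, steps=10, inner=4)
print("clean loss:", loss(x0), "adversarial loss:", loss(x_adv))
```

On a real surrogate network, `grad_fn` would return the gradient of the classification loss with respect to the input image. Whether averaging sampled gradients actually reduces the deviation from the steepest direction depends on the loss surface, and this toy example does not demonstrate it.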
