Attacking Transformers with Feature Diversity Adversarial Perturbation

ArXiv, Pub Date: 2024-03-10, DOI: 10.1609/aaai.v38i3.27947

Chenxing Gao, Hang Zhou, Junqing Yu, Yuteng Ye, Jiale Cai, Junle Wang, Wei Yang

Abstract

Understanding the mechanisms behind Vision Transformer (ViT), particularly its vulnerability to adversarial perturbations, is crucial for addressing challenges in its real-world applications. Existing ViT adversarial attackers rely on labels to calculate the gradient for perturbation, and exhibit low transferability to other structures and tasks. In this paper, we present a label-free white-box attack approach for ViT-based models that exhibits strong transferability to various black-box models, including most ViT variants, CNNs, and MLPs, even for models developed for other modalities. Our inspiration comes from the feature collapse phenomenon in ViTs, where the critical attention mechanism overly depends on the low-frequency component of features, causing the features in middle-to-end layers to become increasingly similar and eventually collapse. We propose the feature diversity attacker to naturally accelerate this process and achieve remarkable performance and transferability.
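The feature collapse idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the actual loss or architecture, so everything here is an assumption made for illustration. The `diversity` measure (share of feature energy outside the token mean, taken as the low-frequency component), the toy `self_attention` layer with identity projections, the random linear feature extractor `proj` standing in for a ViT block, and the step size `eps` are all hypothetical.

```python
import numpy as np

def residual(feats):
    """High-frequency part: what is left after removing the mean token."""
    return feats - feats.mean(axis=0, keepdims=True)

def diversity(feats):
    """Assumed metric: share of feature norm outside the low-frequency (mean) part."""
    return np.linalg.norm(residual(feats)) / np.linalg.norm(feats)

def self_attention(feats):
    """Toy single-head softmax self-attention with identity projections."""
    scores = feats @ feats.T / np.sqrt(feats.shape[1])
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights @ feats

def diversity_attack_step(tokens, proj, eps):
    """One label-free signed-gradient step that lowers ||residual(tokens @ proj)||^2."""
    # Analytic gradient of ||C X W||_F^2 with respect to X, where C removes the token mean.
    grad = 2.0 * residual(tokens) @ proj @ proj.T
    return tokens - eps * np.sign(grad)            # L-infinity change of at most eps

rng = np.random.default_rng(0)
tokens = 0.5 * rng.normal(size=(16, 8))            # 16 hypothetical patch tokens, 8 channels

# Feature collapse: diversity after each of 6 stacked toy attention layers.
history = [diversity(tokens)]
feats = tokens
for _ in range(6):
    feats = self_attention(feats)
    history.append(diversity(feats))

# Label-free perturbation against a random linear feature extractor (assumed stand-in).
proj = rng.normal(size=(8, 8))
adv_tokens = diversity_attack_step(tokens, proj, eps=0.01)
before = np.linalg.norm(residual(tokens @ proj))
after = np.linalg.norm(residual(adv_tokens @ proj))
print(history[0], history[-1], before, after)
```

No label appears anywhere in the perturbation step: the gradient comes only from the features themselves, which is the property the abstract highlights. A real attack would differentiate through an actual ViT's intermediate layers rather than through a linear map.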
