Improving adversarial transferability via semantic-style joint expectation perturbations
Zhi Lin, Bingwen Wang, Xixi Wang, Yu Zhang, Xiao Wang, Kang Deng, Anjie Peng, Jin Tang, Xing Yang
Pattern Recognition, volume 172, Article 112474. Published 2025-09-25. Journal Article.
DOI: 10.1016/j.patcog.2025.112474
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011379
Impact factor: 7.6. JCR: Q1 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Style and content information, which are model-independent inherent properties of an image, serve as crucial information that deep neural networks depend on for classification tasks. However, most existing gradient-based attacks mainly distort content-related information through semantic distortion of the model’s final output, neglecting the role of style information. To fully distort the intrinsic information of the image, this paper proposes Semantic-Style joint Expectation Perturbations (SSEPs). Specifically, we first establish a style loss based on the kernel function from the feature space of the surrogate model and inject it into gradient-based attacks to form a Semantics-Style joint Loss (SSL) for generating joint perturbations. Subsequently, we use gradient normalization and the proposed dynamic gradient decomposition scheme to address the problems of multi-objective gradient magnitude differences and gradient conflicts that occur in SSL during optimization. Finally, we generate SSEPs by motivating the maximization of the expected loss, thereby enhancing the transferability of Adversarial Examples (AEs). On the ImageNet sub-dataset, extensive experiments show that AEs covered with SSEPs have high transferability. Compared to the baseline attack (MI-FGSM), our method achieves at least a 14% and 5% higher attack success rate for normally trained models and defense models, respectively. Compared with other classic and advanced gradient-based attacks and feature-level attacks, our method still has advantages in attack performance. Our code is available at: https://github.com/OUTOFTEN/TransferAttack-ssep
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.