Improving transferability of adversarial examples via statistical attribution-based attacks
Authors: Hegui Zhu, Yanmeng Jia, Yue Yan, Ze Yang
Journal: Neural Networks, volume 187, Article 107341 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 6.0)
Published: 2025-03-10
DOI: 10.1016/j.neunet.2025.107341
URL: https://www.sciencedirect.com/science/article/pii/S0893608025002205
Citation count: 0
Abstract
Adversarial attacks play an important role in uncovering vulnerabilities and assessing the robustness of deep neural networks (DNNs), and they offer insight into the networks' internal mechanisms. Feature-level attacks, a potent approach, craft adversarial examples by extensively corrupting the intermediate-layer features of the source model during each iteration. However, these attacks often rely on imprecise metrics to assess the significance of features, which can limit the transferability of the resulting adversarial examples. To address these issues, this paper introduces the Statistical Attribution-based Attack (SAA) method, which focuses on finding better feature importance representations and refining the optimization objective, thereby achieving stronger attack performance. To calculate the Comprehensive Gradient for a more accurate feature representation, we introduce Region-wise Feature Disturbance and Gradient Information Aggregation, which can effectively disrupt the model's attention focus areas. Subsequently, a statistical attribution-based approach is employed, leveraging the average feature information across layers to provide a more advantageous optimization objective. Experiments validate the superiority of this method. Specifically, SAA improves the attack success rate by 9.3% compared with the second-best method. When combined with input transformation methods, it achieves an average success rate of 79.2% against eight leading defense models.
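The abstract gives no formulas, so the following is only a loose illustrative sketch of the general idea behind disturbing input regions and aggregating feature gradients, not the paper's algorithm. Everything in it is an assumption made for illustration: the toy model (a linear feature layer followed by a tanh readout), the contiguous 1-D regions, the keep probability, the L2 normalisation, and the helper names `linear_features`, `feature_gradient`, `region_mask`, and `aggregated_gradient`. The statistical attribution objective is not sketched, because the abstract does not describe it in enough detail.

```python
import math
import random


def linear_features(x, W):
    """Toy intermediate layer (assumption): z = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def feature_gradient(z, v):
    """Gradient of the toy logit sum_j v_j * tanh(z_j) with respect to z."""
    return [vj * (1.0 - math.tanh(zj) ** 2) for vj, zj in zip(v, z)]


def region_mask(n, region_size, keep_prob, rng):
    """Keep or drop whole contiguous regions of the input at random."""
    mask = []
    for start in range(0, n, region_size):
        keep = 1.0 if rng.random() < keep_prob else 0.0
        mask.extend([keep] * min(region_size, n - start))
    return mask


def aggregated_gradient(x, W, v, region_size=2, keep_prob=0.7,
                        n_samples=30, seed=0):
    """Sum feature gradients over region-disturbed copies of x, then L2-normalise."""
    rng = random.Random(seed)
    total = [0.0] * len(W)
    for _ in range(n_samples):
        mask = region_mask(len(x), region_size, keep_prob, rng)
        x_masked = [m * xi for m, xi in zip(mask, x)]
        g = feature_gradient(linear_features(x_masked, W), v)
        total = [t + gi for t, gi in zip(total, g)]
    norm = math.sqrt(sum(t * t for t in total)) or 1.0
    return [t / norm for t in total]


# Hypothetical toy weights and input, chosen only to make the sketch runnable.
W = [[0.5, -0.2, 0.1, 0.3],
     [-0.4, 0.6, 0.2, -0.1],
     [0.3, 0.3, -0.5, 0.2]]
v = [1.0, -0.5, 0.8]
x = [0.9, 0.1, 0.4, 0.7]

importance = aggregated_gradient(x, W, v)
print(importance)
```

In a feature-level attack, a vector such as `importance` would typically weight the intermediate features inside the loss that the adversarial perturbation is optimised against. How SAA defines that loss from average feature information across layers is left to the paper itself.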
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.