M3C: Resist Agnostic Attacks by Mitigating Consistent Class Confusion Prior.

Impact Factor 18.6 · JCR Q1, Computer Science, Artificial Intelligence · CAS Region 1, Computer Science
Xiaowei Fu,Fuxiang Huang,Guoyin Wang,Xinbo Gao,Lei Zhang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOI: 10.1109/tpami.2025.3614495
Publication date: 2025-09-25
Citations: 0

Abstract

Adversarial attacks are a major obstacle to the deployment of deep neural networks (DNNs) in security-sensitive applications. To address these adversarial perturbations, various adversarial defense strategies have been developed, with Adversarial Training (AT) being one of the most effective methods to protect neural networks from adversarial attacks. However, existing AT methods struggle against training-agnostic attacks due to their limited generalizability. This suggests that AT models lack a unified perspective across diverse attacks from which to mount a universal defense. This paper sheds light on a generalizable prior under various attacks: consistent class confusion (3C), i.e., an AT classifier often confuses predictions between the correct and ambiguous classes in a highly similar pattern across diverse attacks. Relying on this latent prior as a bridge between seen and agnostic attacks, we propose a more generalized AT model that mitigates consistent class confusion (M3C) to resist training-agnostic attacks. Specifically, we optimize an Adversarial Confusion Loss (ACL), weighted by uncertainty, to distinguish the most confused classes and encourage the AT model to focus on these confused samples. To suppress malignant features that affect correct predictions and produce significant class confusion, we propose a Gradient-Aware Attention (GAA) mechanism to enhance the classification confidence of correct classes and eliminate class confusion. Experiments on multiple benchmarks and network architectures demonstrate that our M3C model significantly improves the generalization of AT robustness against agnostic attacks. The discovery of the 3C prior reveals the potential for defending against a wide range of attacks, and provides a new perspective for overcoming this challenge in the field.
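The abstract's two central ideas can be illustrated with a toy numerical sketch (a hypothetical illustration under assumed definitions, not the paper's actual implementation; the function names and the exact form of the loss are my assumptions): first, the 3C prior says that the off-target (confusion) probability pattern of an AT classifier is highly similar across different attacks, which we can measure by cosine similarity; second, an uncertainty-weighted confusion loss in the spirit of ACL penalizes the probability of each sample's most confused wrong class, weighting high-entropy (uncertain) samples more heavily.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def confusion_pattern(probs, y):
    """Off-target probability mass per sample: zero out the true class."""
    p = probs.copy()
    p[np.arange(len(y)), y] = 0.0
    return p

def confusion_consistency(p_a, p_b):
    """Cosine similarity between confusion patterns under two attacks.
    The 3C prior predicts this is high for an AT classifier."""
    num = (p_a * p_b).sum()
    return num / (np.linalg.norm(p_a) * np.linalg.norm(p_b) + 1e-12)

def adversarial_confusion_loss(probs, y):
    """Toy uncertainty-weighted confusion loss (assumed form): penalize the
    most confused wrong class, weighting each sample by normalized entropy."""
    n, c = probs.shape
    off = confusion_pattern(probs, y)
    top_confused = off.max(axis=1)                        # prob of most confused class
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    weights = entropy / np.log(c)                         # uncertainty in [0, 1]
    return float((weights * top_confused).mean())

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5))
y = np.array([0, 1, 2, 3])
# Simulate two different attacks that both push mass toward the same
# ambiguous class (index 4), producing a consistent confusion pattern:
attack_a = logits.copy(); attack_a[:, 4] += 2.0
attack_b = logits.copy(); attack_b[:, 4] += 1.5
pa = confusion_pattern(softmax(attack_a), y)
pb = confusion_pattern(softmax(attack_b), y)
print(confusion_consistency(pa, pb))   # high similarity -> consistent confusion
print(adversarial_confusion_loss(softmax(attack_a), y))
```

Minimizing such a loss during adversarial training would push down exactly the confusion pattern that, by the 3C prior, is shared with attacks never seen in training, which is the intuition behind using the prior as a bridge to agnostic attacks.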
Source journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
CiteScore: 28.40
Self-citation rate: 3.00%
Articles per year: 885
Review time: 8.5 months
Journal overview: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures are also covered.